pymicra.algs package¶

Submodules¶

pymicra.algs.auxiliar¶

pymicra.algs.auxiliar.applyResult(result, failed, df, control=None, testname=None, filename=None, failshow=False, index_n=None)¶

Auxiliary function to be used with util.qcontrol

Parameters:

result (bool) – whether the test failed or succeeded
failed (list) – list of failed variables. None if the test was successful
control (dictionary) – dictionary whose keys are the names of the tests and whose values are lists
testname (string) – name of the test (has to match control dict)
filename (string) – name or path or identifier of the file tested
failshow (bool) – whether to show the failed variables or not

pymicra.algs.auxiliar.first_last(fname)¶: Returns first and last lines of a file

pymicra.algs.auxiliar.lenYear(year)¶: Calculates the length of a year in days. Useful to figure out if a certain year is a leap year.
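The idea can be sketched with the standard library alone. This is an illustration, not pymicra's own code, and the name year_length is hypothetical.

```python
import calendar


def year_length(year):
    """Return the number of days in `year` (hypothetical stand-in for lenYear)."""
    # calendar.isleap applies the Gregorian leap-year rule
    return 366 if calendar.isleap(year) else 365


# A year is a leap year exactly when its length is 366 days
is_leap_2016 = year_length(2016) == 366
```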

pymicra.algs.auxiliar.stripDown(str, final='', args=['_', '-'])¶: Auxiliary function to strip down keywords from symbols

pymicra.algs.auxiliar.testValid(df_valid, testname='', failverbose=True, passverbose=True, filepath=None)¶

Tests a boolean DataFrame obtained from the test and prints the outcome to standard output

Parameters:

df_valid (pandas.Series) – series containing only True or False values for each of the variables, which should be the indexes
testname (string) – the name of the test that generated the True/False values
failverbose (bool) – whether to report which variables caused a false result
passverbose (bool) – whether to print something for successful cases

Returns:

result (bool) – True if the run passed the test
failed (list) – list of failed variables if result==False. None otherwise.
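The return convention described above can be sketched as follows. This is a minimal illustration with pandas, not the library's implementation; summarize_validity and the variable names in the example series are hypothetical.

```python
import pandas as pd


def summarize_validity(valid):
    """Return (result, failed) from a boolean Series indexed by variable name."""
    # Names of the variables whose entry is False
    failed = valid[~valid].index.tolist()
    result = len(failed) == 0
    return result, (None if result else failed)


checks = pd.Series({"u": True, "v": False, "w": True})
result, failed = summarize_validity(checks)
```

Here result is False and failed lists only the variable "v"; with an all-True series the function returns (True, None).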

pymicra.algs.general¶

pymicra.algs.general.classbin(x, y, bins_number=100, function=<function mean>, xfunction=<function mean>, logscale=True)¶

Separates x and y inputs into bins based on the x array. x and y do not have to be ordered.

Parameters:

x (np.array) – independent variable
y (np.array) – dependent variable
bins_number (int) – number of classes (or bins) desired
function (callable) – function to be applied to both x and y-bins in order to smooth the data
logscale (boolean) – whether or not to use a log-spaced scale to set the bins

Returns:

np.array – x binned
np.array – y binned
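A minimal sketch of this kind of binning, assuming NumPy and using the mean as the smoothing function. The name bin_average is hypothetical and the edge handling is an assumption, so results may differ in detail from classbin.

```python
import numpy as np


def bin_average(x, y, bins_number=10, logscale=False):
    """Average x and y inside bins defined over the range of x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if logscale:
        # Log-spaced edges require strictly positive x
        edges = np.logspace(np.log10(x.min()), np.log10(x.max()), bins_number + 1)
    else:
        edges = np.linspace(x.min(), x.max(), bins_number + 1)
    # Bin index of each point; clipping puts the extreme points in the end bins
    idx = np.clip(np.digitize(x, edges) - 1, 0, bins_number - 1)
    xb, yb = [], []
    for k in range(bins_number):
        mask = idx == k
        if mask.any():  # skip empty bins
            xb.append(x[mask].mean())
            yb.append(y[mask].mean())
    return np.array(xb), np.array(yb)


# The inputs do not need to be ordered
xb, yb = bin_average([3, 1, 4, 2], [30, 10, 40, 20], bins_number=2)
```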

pymicra.algs.general.diff_central(x, y)¶

Applies the central finite difference scheme

Parameters:

x (array) – independent variable
y (array) – dependent variable

Returns:

dydx (array) – the dependent variable differentiated
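For reference, a central difference at interior points can be sketched with NumPy as below. How diff_central treats the two end points is not stated here, so this hypothetical central_difference returns interior values only.

```python
import numpy as np


def central_difference(x, y):
    """dy/dx at interior points: (y[i+1] - y[i-1]) / (x[i+1] - x[i-1])."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return (y[2:] - y[:-2]) / (x[2:] - x[:-2])


x = np.array([0.0, 1.0, 2.0, 3.0])
dydx = central_difference(x, x ** 2)  # derivative of x**2 at x = 1 and x = 2
```

On evenly spaced points the scheme is exact for quadratics, so the sketch yields 2 and 4 here.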

pymicra.algs.general.file_len(fname)¶

Returns the length of a file by piping to the shell’s wc command

Parameters:	fname (string) – path of the file

pymicra.algs.general.find_nearest(array, value)¶

Finds the index of the element of an array that is nearest to a given value

Parameters:

array (array) – list or array
value (float) – value to look for in the array
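One common way to do this with NumPy is to minimize the absolute difference. The sketch below shows that approach under the hypothetical name nearest_index; it is not necessarily how find_nearest is written.

```python
import numpy as np


def nearest_index(array, value):
    """Index of the element of `array` closest to `value`."""
    array = np.asarray(array, dtype=float)
    # argmin returns the first index in case of ties
    return int(np.abs(array - value).argmin())


idx = nearest_index([1.0, 2.5, 4.0], 2.9)
```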

pymicra.algs.general.fitByDate(data, degree=1, rule=None)¶

Given a pandas DataFrame with the index as datetime, this routine fits an n-degree polynomial to the dataset

Parameters:

data (pd.DataFrame, pd.Series) – dataframe whose columns have to be fitted
degree (int) – degree of the polynomial. Default is 1.
rule (str) – pandas offset string. Ex.: “10min”.

pymicra.algs.general.fitWrap(x, y, degree=1)¶

A wrapper to numpy.polyfit and numpy.polyval that fits data given x and y arrays. This is specifically designed to be used with the pandas.DataFrame.apply method

Parameters:

x (array, list)
y (array, list)
degree (int)
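The polyfit/polyval combination mentioned above can be sketched as follows. The name fit_values is hypothetical, and returning the fitted values at the input x is an assumption about what such a wrapper yields.

```python
import numpy as np


def fit_values(x, y, degree=1):
    """Fit a polynomial of the given degree and evaluate it at x."""
    coeffs = np.polyfit(x, y, degree)  # highest power first
    return np.polyval(coeffs, x)


x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 1.0
fitted = fit_values(x, y, degree=1)  # reproduces y for exactly linear data
```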

pymicra.algs.general.get_index(x, to_look_for)¶

Just like the .index method of lists, except it works for multiple values

Parameters:

x (list or array) – the main array
to_look_for (list or array) – the subset of the main array whose indexes are desired

Returns:

indexes (np.array) – array with the indexes of each element in to_look_for
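A minimal sketch of the behaviour described, using plain lists and NumPy only for the return type. The name indexes_of is hypothetical.

```python
import numpy as np


def indexes_of(x, to_look_for):
    """Position in x of each element of to_look_for (first occurrence)."""
    x = list(x)
    # list.index raises ValueError if an element is missing from x
    return np.array([x.index(item) for item in to_look_for])


columns = ["u", "v", "w", "theta"]
found = indexes_of(columns, ["w", "u"])
```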

pymicra.algs.general.get_notation(notation_def)¶: Auxiliary function to retrieve notation

pymicra.algs.general.inverse_normal_cdf(mu, sigma)¶

Applies the inverse normal cumulative distribution

Parameters:

mu – mean
sigma – standard deviation

pymicra.algs.general.latexify(variables, math_mode=True)¶

pymicra.algs.general.limitedSubs(data, max_interp=3, func=<function <lambda>>)¶

Substitutes elements with NaNs if the condition given by func is met at most max_interp times in a row. If there are more than that number in a row, then they are not substituted.

Parameters:

data (pandas.dataframe) – data to be interpolated
max_interp (int) – maximum number of NaNs in a row to interpolate
func (function) – function of x only that determines which elements become NaNs. Should return only True or False.

Returns:

df (pandas.dataframe) – dataframe with the elements substituted

pymicra.algs.general.limited_interpolation(data, maxcount=3)¶

Interpolates linearly, but only if the gap is smaller than or equal to maxcount

Parameters:

data (pandas.DataFrame) – dataset to interpolate
maxcount (int) – maximum number of consecutive NaNs to interpolate. If a gap is longer than that, nothing is done with its points.
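Gap-limited interpolation of this kind can be sketched with pandas as below, for a single Series. This is an illustration of the technique under the hypothetical name interpolate_short_gaps, not pymicra's implementation.

```python
import numpy as np
import pandas as pd


def interpolate_short_gaps(series, maxcount=3):
    """Linearly fill runs of NaNs no longer than maxcount; leave longer runs."""
    isnan = series.isna()
    # Label each run of consecutive NaN / non-NaN entries
    run_id = isnan.astype(int).diff().ne(0).cumsum()
    # Length of the NaN run each element belongs to (0 for valid entries)
    run_len = isnan.astype(int).groupby(run_id).transform("sum")
    filled = series.interpolate(method="linear", limit_area="inside")
    # Undo the interpolation wherever the gap was too long
    too_long = isnan & (run_len > maxcount)
    return filled.mask(too_long)


s = pd.Series([1.0, np.nan, 3.0, np.nan, np.nan, np.nan, 7.0])
out = interpolate_short_gaps(s, maxcount=2)
```

With maxcount=2 the single missing point is filled with 2.0, while the run of three NaNs is left as it was.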

pymicra.algs.general.line2date(line, dlconfig)¶

Gets a date from a line of file according to dataloggerConfig object.

Parameters:

line (string) – line of file with date inside
dlconfig (pymicra.dataloggerConfig) – configuration of the datalogger

Returns:

timestamp (datetime object)

pymicra.algs.general.mad(data, axis=None)¶

pymicra.algs.general.name2date(filename, dlconfig)¶

Gets a date from the name of the file according to a datalogger config object

Parameters:

filename (string) – the (base) name of the file
dlconfig (pymicra.dataloggerConfig) – configuration of the datalogger

Returns:

cdate (datetime object)

Warning: Needs to be optimized in order to read question markers also after the date
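As a rough illustration of parsing a date out of a file name, the sketch below assumes the datalogger configuration boils down to a strptime-style pattern. The function date_from_name, the pattern and the file name are all hypothetical.

```python
from datetime import datetime


def date_from_name(filename, pattern):
    """Parse a datetime from a base file name using a strptime pattern."""
    # Literal characters in the pattern (prefix, extension) must match exactly
    return datetime.strptime(filename, pattern)


cdate = date_from_name("tower_20150810-1430.csv", "tower_%Y%m%d-%H%M.csv")
```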

pymicra.algs.general.parseDates(data, dataloggerConfig=None, date_col_names=None, clean=True, verbose=False, connector='')¶

This routine parses the date from a pandas DataFrame when it is divided into several columns.

Author: Tomas Chor
Date: 2015-08-10

Parameters:

data (pandas DataFrame) – DataFrame whose dates have to be parsed
date_col_names (list) – a list of the names of the columns in which the date is divided. The naming of the date columns must be in accordance with the datetime directives, so if the first column is only the year, its name must be %Y and so forth. See https://docs.python.org/2/library/datetime.html#strftime-and-strptime-behavior
connector (string) – should be used only when the default connector causes some conflict
first_time_skip (int) – the offset (mostly because of the bad converting done by LBA)
clean (bool) – remove date columns from data after it is introduced as index

Returns:

pandas.DataFrame – data indexed by timestamp
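The column-naming convention can be illustrated with a short pandas sketch: the date columns are joined into one string per row and parsed with a format built from the column names. The function parse_split_dates is hypothetical, and it uses "-" as connector for clarity, whereas the signature above defaults to an empty string.

```python
import pandas as pd


def parse_split_dates(data, date_col_names, connector="-", clean=True):
    """Build a datetime index from columns named after strptime directives."""
    # One string per row, e.g. "2015-10-12-13"
    joined = data[date_col_names].astype(str).apply(connector.join, axis=1)
    # The column names themselves form the format, e.g. "%Y-%m-%d-%H"
    fmt = connector.join(date_col_names)
    stamps = pd.to_datetime(joined, format=fmt)
    out = data.drop(columns=date_col_names) if clean else data.copy()
    out.index = pd.DatetimeIndex(stamps)
    return out


raw = pd.DataFrame({"%Y": [2015, 2015], "%m": [10, 10], "%d": [12, 12],
                    "%H": [13, 14], "u": [1.2, 1.5]})
parsed = parse_split_dates(raw, ["%Y", "%m", "%d", "%H"])
```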

pymicra.algs.general.resample(df, rule, how=None, **kwargs)¶: Extends pandas resample methods to indexes made of integers

pymicra.algs.general.splitData(data, rule='30min', return_index=False, **kwargs)¶

Splits a given pandas DataFrame into a series of “rule”-spaced DataFrames

Parameters:

data (pandas dataframe) – data to be split
rule (str or int) –

If it is a string, it should be a pandas offset string, optionally preceded by an integer multiple (as in the default, '30min'). Some possible aliases are:

D – calendar day frequency
W – weekly frequency
M – month end frequency
MS – month start frequency
Q – quarter end frequency
BQ – business quarter end frequency
QS – quarter start frequency
A – year end frequency
AS – year start frequency
H – hourly frequency
T – minutely frequency
Min – minutely frequency
S – secondly frequency
L – milliseconds
U – microseconds

If it is an int, it should be the number of lines desired in each separated piece.

If it is None, then the dataframe isn’t separated and a list containing only the full dataframe is returned.

See the complete list of offset aliases at http://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases
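The three cases for rule can be sketched with pandas as follows. This is an illustration under the hypothetical name split_frame and does not reproduce the return_index option or the extra keyword arguments of splitData.

```python
import pandas as pd


def split_frame(data, rule="30min"):
    """Split a DataFrame by time offset (str), by row count (int), or not at all."""
    if rule is None:
        return [data]
    if isinstance(rule, int):
        # Consecutive pieces of `rule` rows; the last one may be shorter
        return [data.iloc[i:i + rule] for i in range(0, len(data), rule)]
    # Time-based split; empty periods are dropped
    groups = data.groupby(pd.Grouper(freq=rule))
    return [chunk for _, chunk in groups if len(chunk) > 0]


index = pd.date_range("2015-08-10 00:00", periods=6, freq="10min")
df = pd.DataFrame({"u": range(6)}, index=index)
by_time = split_frame(df, "30min")  # two pieces of three rows each
by_rows = split_frame(df, 4)        # pieces of four and two rows
```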

pymicra.algs.numeric¶

pymicra.algs.units¶

pymicra.algs.units.add(elems, units, inplace_units=False, unitdict=None, key=None)¶: Add elements considering their units

pymicra.algs.units.convert_cols(data, guide, units, inplace_units=False)¶

Converts data from one unit to the other

Parameters:

data (pandas.DataFrame) – to be changed from one unit to the other
guide (dict) – {names of columns : units to convert to}
units (dict) – units dictionary
inplace_units (bool) – if inunit is a dict, the dict is updated in place. “key” keyword must be provided

pymicra.algs.units.convert_indexes(data, guide, units, inplace_units=False)¶

Converts data from one unit to the other

Parameters:

data (pandas.Series) – to be changed from one unit to the other
guide (dict) – {names of columns : units to convert to}
units (dict) – units dictionary
inplace_units (bool) – if inunit is a dict, the dict is updated in place. “key” keyword must be provided

pymicra.algs.units.convert_to(data, inunit, outunit, inplace_units=False, key=None)¶

Converts data from one unit to the other

Parameters:

data (pandas.series) – to be changed from one unit to the other
inunit (pint.quantity or dict) – unit(s) that the data is in
outunit (str) – convert to this unit
inplace_units (bool) – if inunit is a dict, the dict is updated in place. “key” keyword must be provided
key (str) – if inunit is a dict, it is the name of the variable to be changed

pymicra.algs.units.divide(elems, units, inplace_units=False, unitdict=None, key=None)¶: Divide elements considering their units

pymicra.algs.units.multiply(elems, units, inplace_units=False, unitdict=None, key=None)¶: Multiply elements considering their units

pymicra.algs.units.operate(elems, units, inplace_units=False, unitdict=None, key=None, operation='+')¶

Operate on elements considering their units

Parameters:

elems (list, tuple) – list of pandas.Series
units (list, tuple) – list of pint.units ordered as the elems list
inplace_units (bool) – sets dictionary inplace_units
unitdict (dict) – dict to be set inplace
key (str) – name of variables to be set inplace as dict key

pymicra.algs.units.parseUnits(unitstr)¶: Gets unit from string, list of strings, or dict’s values, using the UnitRegistry defined in __init__.py

pymicra.algs.units.with_units(data, units)¶

Wrapper around toUnitsCsv to create a method to print the contents of a dataframe plus its units into a unitsCsv file.

Parameters:

data (pandas.DataFrame, pandas.Series) – dataframe or series to which units belong
units (dict) – dictionary with the names of each column and their unit

Module contents¶