pymicra.algs package¶
Submodules¶
pymicra.algs.auxiliar¶
-
pymicra.algs.auxiliar.
applyResult
(result, failed, df, control=None, testname=None, filename=None, failshow=False, index_n=None)¶ Auxiliary function to be used with util.qcontrol
Parameters: - result (bool) – whether the test succeeded (True) or failed (False)
- failed (list) – list of failed variables. None object if the test was successful
- control (dictionary) – dictionary whose keys are the names of the tests and items are lists
- testname (string) – name of the test (has to match control dict)
- filename (string) – name or path or identifier of the file tested
- failshow (bool) – whether to show the failed variables or not
-
pymicra.algs.auxiliar.
first_last
(fname)¶ Returns first and last lines of a file
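A minimal sketch of the idea in plain Python, not pymicra's actual implementation. The name first_last_sketch and the throwaway demo file are assumptions made for illustration.

```python
import os
import tempfile


def first_last_sketch(fname):
    """Return the first and last lines of a text file (hypothetical helper)."""
    with open(fname) as fh:
        first = fh.readline()
        last = first  # a one-line file has the same first and last line
        for line in fh:  # stream the rest without loading the whole file
            last = line
    return first, last


# Demo on a temporary file with three lines
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("header\nrow 1\nrow 2\n")
first, last = first_last_sketch(tmp.name)
os.remove(tmp.name)
```

Streaming the file line by line keeps memory use constant, which matters for long datalogger files.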
-
pymicra.algs.auxiliar.
lenYear
(year)¶ Calculates the length of a year in days. Useful to figure out if a certain year is a leap year.
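The same calculation can be sketched with the standard library. The name len_year_sketch is hypothetical and this is not necessarily how pymicra computes it.

```python
import calendar


def len_year_sketch(year):
    """Number of days in a Gregorian calendar year (hypothetical helper)."""
    return 366 if calendar.isleap(year) else 365


days_2015 = len_year_sketch(2015)
days_2016 = len_year_sketch(2016)
days_1900 = len_year_sketch(1900)  # century years are leap only if divisible by 400
```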
-
pymicra.algs.auxiliar.
stripDown
(str, final='', args=['_', '-'])¶ Auxiliary function to strip down keywords from symbols
-
pymicra.algs.auxiliar.
testValid
(df_valid, testname='', failverbose=True, passverbose=True, filepath=None)¶ Tests a boolean DataFrame obtained from the test and prints standard output
Parameters: - df_valid (pandas.Series) – series containing only True or False values for each of the variables, which should be the indexes
- testname (string) – the name of the test that generated the True/False values
- failverbose (bool) – whether to return which variables caused a false result
- passverbose (bool) – whether to print something for successful cases
Returns: - result (bool) – True if the run passed the test
- failed (list) – list of failed variables if result==False. None otherwise.
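The return contract described above can be illustrated with a small stand-in. This is only a sketch under assumptions: check_valid_sketch is a hypothetical name, the printed messages are invented, and a plain dict plays the role of the boolean pandas.Series (both provide .items()).

```python
def check_valid_sketch(valid, testname="", failverbose=True, passverbose=True):
    """Summarise a mapping of variable name -> bool into (result, failed)."""
    failed = [name for name, ok in valid.items() if not ok]
    result = not failed  # True only when every variable passed
    if result and passverbose:
        print("Passed", testname)
    elif not result and failverbose:
        print("Failed", testname, "for variables:", ", ".join(map(str, failed)))
    return result, (failed if failed else None)


result, failed = check_valid_sketch(
    {"u": True, "v": False, "theta": True}, testname="spikes"
)
ok_result, ok_failed = check_valid_sketch({"u": True}, passverbose=False)
```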
pymicra.algs.general¶
-
pymicra.algs.general.
classbin
(x, y, bins_number=100, function=<function mean>, xfunction=<function mean>, logscale=True)¶ Separates x and y inputs into bins based on the x array. x and y do not have to be ordered.
Parameters: - x (np.array) – independent variable
- y (np.array) – dependent variable
- bins_number (int) – number of classes (or bins) desired
- function (callable) – function to be applied to both x and y-bins in order to smooth the data
- logscale (boolean) – whether or not to use a log-spaced scale to set the bins
Returns: - np.array – x binned
- np.array – y binned
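One way to bin unordered data by x can be sketched with NumPy as follows. This is an illustrative reimplementation, not pymicra's code: classbin_sketch is a hypothetical name, empty bins are simply dropped, and the handling of bin edges is an assumption.

```python
import numpy as np


def classbin_sketch(x, y, bins_number=100, function=np.mean, logscale=True):
    """Bin y by the value of x and reduce each bin with `function`."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if logscale:
        # log-spaced edges require strictly positive x
        edges = np.logspace(np.log10(x.min()), np.log10(x.max()), bins_number + 1)
    else:
        edges = np.linspace(x.min(), x.max(), bins_number + 1)
    # digitize gives 1..bins_number for interior points; clip so that the
    # extreme values fall in the first and last bins
    idx = np.clip(np.digitize(x, edges), 1, bins_number)
    xb, yb = [], []
    for k in range(1, bins_number + 1):
        mask = idx == k
        if mask.any():  # skip empty bins
            xb.append(function(x[mask]))
            yb.append(function(y[mask]))
    return np.array(xb), np.array(yb)


# Unordered input, two linear bins: [1, 4.5) and [4.5, 8]
x = np.array([8, 1, 5, 2, 7, 3, 6, 4], dtype=float)
xb, yb = classbin_sketch(x, 2 * x, bins_number=2, logscale=False)
```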
-
pymicra.algs.general.
diff_central
(x, y)¶ Applies the central finite difference scheme
Parameters: - x (array) – independent variable
- y (array) – dependent variable
Returns: dydx – the dependent variable differentiated
Return type: array
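The central scheme approximates the derivative at point i as (y[i+1] - y[i-1]) / (x[i+1] - x[i-1]). A minimal sketch in plain Python follows. The name diff_central_sketch is hypothetical, and returning only interior points is an assumption, since the documentation above does not say how the endpoints are treated.

```python
def diff_central_sketch(x, y):
    """Central finite difference of y with respect to x at interior points."""
    return [
        (y[i + 1] - y[i - 1]) / (x[i + 1] - x[i - 1])
        for i in range(1, len(x) - 1)
    ]


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [v ** 2 for v in xs]  # y = x^2, so dy/dx = 2x
dydx = diff_central_sketch(xs, ys)
```

On a uniform grid the scheme is exact for quadratics, so the result here matches 2x at x = 1, 2, 3.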
-
pymicra.algs.general.
file_len
(fname)¶ Returns length of a file through piping bash’s function wc
Parameters: fname (string) – path of the file
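For environments without wc, a portable alternative can be sketched in pure Python. This assumes that "length" means the number of lines, which the description above does not state explicitly. file_len_sketch is a hypothetical name.

```python
import os
import tempfile


def file_len_sketch(fname):
    """Count the lines of a file without shelling out to wc."""
    with open(fname, "rb") as fh:  # binary mode avoids decoding errors
        return sum(1 for _ in fh)


with tempfile.NamedTemporaryFile("w", suffix=".dat", delete=False) as tmp:
    tmp.write("a\nb\nc\n")
n_lines = file_len_sketch(tmp.name)
os.remove(tmp.name)
```

Note that wc -l counts newline characters, so the two approaches can differ by one when the last line has no trailing newline.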
-
pymicra.algs.general.
find_nearest
(array, value)¶ Finds the index of the element of an array that is nearest to a given value
Parameters: - array (array) – list or array
- value (float) – value to look for in the array
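The lookup amounts to minimising the absolute difference between each element and the target. A sketch in plain Python, with the hypothetical name find_nearest_sketch:

```python
def find_nearest_sketch(array, value):
    """Index of the element of `array` closest to `value`."""
    return min(range(len(array)), key=lambda i: abs(array[i] - value))


heights = [0.0, 1.5, 3.0, 4.5]
idx = find_nearest_sketch(heights, 2.9)
```

When two elements are equally close, this sketch returns the first one. Whether pymicra does the same is not stated above.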
-
pymicra.algs.general.
fitByDate
(data, degree=1, rule=None)¶ Given a pandas DataFrame with the index as datetime, this routine fits an n-degree polynomial to the dataset
Parameters: - data (pd.DataFrame, pd.Series) – dataframe whose columns have to be fitted
- degree (int) – degree of the polynomial. Default is 1.
- rule (str) – pandas offset string. Ex.: “10min”.
-
pymicra.algs.general.
fitWrap
(x, y, degree=1)¶ A wrapper around numpy.polyfit and numpy.polyval that fits data given x and y arrays. This is specifically designed to be used with the pandas.DataFrame.apply method
Parameters: - x (array, list) –
- y (array, list) –
- degree (int) –
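The combination of numpy.polyfit and numpy.polyval can be sketched as below. This assumes the wrapper returns the fitted polynomial evaluated at x, which the description suggests but does not confirm. fit_wrap_sketch is a hypothetical name.

```python
import numpy as np


def fit_wrap_sketch(x, y, degree=1):
    """Fit a polynomial of the given degree and evaluate it at x."""
    coefs = np.polyfit(x, y, degree)  # highest power first
    return np.polyval(coefs, x)


x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 5.0, 7.0])  # exactly y = 2x + 1
fitted = fit_wrap_sketch(x, y, degree=1)
```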
-
pymicra.algs.general.
get_index
(x, to_look_for)¶ Just like the .index method of lists, except it works for multiple values
Parameters: - x (list or array) – the main array
- to_look_for (list or array) – the subset of the main whose indexes are desired
Returns: indexes – array with the indexes of each element in to_look_for
Return type: np.array
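The behaviour can be illustrated with a short sketch. get_index_sketch is a hypothetical name, it returns a plain list rather than an np.array, and it reports the first occurrence of each value, as list.index does.

```python
def get_index_sketch(x, to_look_for):
    """Positions in x of each element of to_look_for (first occurrence)."""
    positions = {}
    for i, item in enumerate(x):
        positions.setdefault(item, i)  # keep the first position only
    return [positions[item] for item in to_look_for]


columns = ["u", "v", "w", "theta"]
indexes = get_index_sketch(columns, ["w", "u"])
```

Building the lookup dictionary once keeps the cost linear even when many values are requested.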
-
pymicra.algs.general.
get_notation
(notation_def)¶ Auxiliary function to retrieve notation
-
pymicra.algs.general.
inverse_normal_cdf
(mu, sigma)¶ Applies the inverse normal cumulative distribution
Parameters: - mu – mean
- sigma – standard deviation
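For reference, the inverse normal CDF (quantile function) is available in the standard library. The sketch below is an illustration only: the documented signature takes just mu and sigma, so the probability argument p and the name inverse_normal_cdf_sketch are assumptions.

```python
from statistics import NormalDist


def inverse_normal_cdf_sketch(p, mu=0.0, sigma=1.0):
    """Value below which a N(mu, sigma) variable falls with probability p."""
    return NormalDist(mu=mu, sigma=sigma).inv_cdf(p)


median = inverse_normal_cdf_sketch(0.5, mu=10.0, sigma=2.0)
upper = inverse_normal_cdf_sketch(0.975)  # close to the familiar 1.96
```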
-
pymicra.algs.general.
latexify
(variables, math_mode=True)¶
-
pymicra.algs.general.
limitedSubs
(data, max_interp=3, func=<function <lambda>>)¶ Substitutes elements for NaNs if a certain condition given by func is met at most max_interp times in a row. If there are more than that number in a row, then they are not substituted.
Parameters: - data (pandas.dataframe) – data to be interpolated
- max_interp (int) – number of maximum NaNs in a row to interpolate
- func (function) – function of x only that determines which elements become NaNs. Should return only True or False.
Returns: df – dataframe with the elements substituted
Return type: pandas.dataframe
-
pymicra.algs.general.
limited_interpolation
(data, maxcount=3)¶ Interpolates linearly, but only if the gap is smaller than or equal to maxcount
Parameters: - data (pandas.DataFrame) – dataset to interpolate
- maxcount (int) – maximum number of consecutive NaNs to interpolate. If a gap is longer than that, nothing is done with its points.
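The gap-limited rule can be sketched on a plain list of floats. This is not pymicra's implementation: limited_interpolation_sketch is a hypothetical name, it works on a list instead of a DataFrame, and leaving leading and trailing gaps untouched is an assumption.

```python
import math


def limited_interpolation_sketch(values, maxcount=3):
    """Linearly fill runs of NaN no longer than maxcount; leave longer runs."""
    out = list(values)
    n = len(out)
    i = 0
    while i < n:
        if not math.isnan(out[i]):
            i += 1
            continue
        start = i
        while i < n and math.isnan(out[i]):
            i += 1
        end = i  # index of the first valid point after the gap
        gap = end - start
        # fill only interior gaps that are short enough
        if start > 0 and end < n and gap <= maxcount:
            left, right = out[start - 1], out[end]
            for k in range(start, end):
                frac = (k - start + 1) / (gap + 1)
                out[k] = left + frac * (right - left)
    return out


nan = float("nan")
filled = limited_interpolation_sketch(
    [1.0, nan, 3.0, nan, nan, nan, 7.0], maxcount=2
)
```

The single missing point is filled with 2.0, while the run of three NaNs exceeds maxcount and stays untouched.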
-
pymicra.algs.general.
line2date
(line, dlconfig)¶ Gets a date from a line of file according to dataloggerConfig object.
Parameters: - line (string) – line of file with date inside
- dlconfig (pymicra.dataloggerConfig) – configuration of the datalogger
Returns: timestamp
Return type: datetime object
-
pymicra.algs.general.
mad
(data, axis=None)¶
-
pymicra.algs.general.
name2date
(filename, dlconfig)¶ Gets a date from the name of the file according to a datalogger config object
Parameters: - filename (string) – the (base) name of the file
- dlconfig (pymicra.dataloggerConfig) – configuration of the datalogger
Returns: - cdate (datetime object)
- Warning: Needs to be optimized in order to read question markers also after the date
-
pymicra.algs.general.
parseDates
(data, dataloggerConfig=None, date_col_names=None, clean=True, verbose=False, connector='')¶ Parses the date from a pandas DataFrame when it is divided into several columns. Author: Tomas Chor. Date: 2015-08-10.
Parameters: - data (pandas DataFrame) – dataFrame whose dates have to be parsed
- date_col_names (list) – A list of the names of the columns in which the date is divided. The naming of the date columns must be in accordance with the datetime directives, so if the first column is only the year, its name must be %Y, and so forth. See https://docs.python.org/2/library/datetime.html#strftime-and-strptime-behavior
- connector (string) – should be used only when the default connector causes some conflict
- first_time_skip (int) – the offset (mostly because of the bad converting done by LBA
- clean (bool) – remove date columns from data after it is introduced as index
Returns: data indexed by timestamp
Return type: pandas.DataFrame
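The column-naming convention can be illustrated for a single row with the standard library. This is a sketch under assumptions: parse_row_date_sketch is a hypothetical name, a dict stands in for a DataFrame row, and a "-" connector is used here for readability instead of the default empty string.

```python
from datetime import datetime


def parse_row_date_sketch(row, date_col_names, connector="-", clean=True):
    """Build a timestamp from columns named after strptime directives."""
    # Joining the column names yields the format string, and joining the
    # column values in the same order yields the text to parse.
    fmt = connector.join(date_col_names)
    text = connector.join(str(row[name]) for name in date_col_names)
    timestamp = datetime.strptime(text, fmt)
    if clean:  # drop the date columns once the timestamp is built
        row = {k: v for k, v in row.items() if k not in date_col_names}
    return timestamp, row


raw = {"%Y": 2015, "%m": 8, "%d": 10, "%H%M": "1430", "u": 1.2}
stamp, rest = parse_row_date_sketch(raw, ["%Y", "%m", "%d", "%H%M"])
```

A non-empty connector avoids ambiguity when values are not zero-padded, which is presumably the kind of conflict the connector parameter is meant to resolve.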
-
pymicra.algs.general.
resample
(df, rule, how=None, **kwargs)¶ Extends pandas resample methods to index made of integers
-
pymicra.algs.general.
splitData
(data, rule='30min', return_index=False, **kwargs)¶ Splits a given pandas DataFrame into a series of “rule”-spaced DataFrames
Parameters: - data (pandas dataframe) – data to be split
- rule (str or int) –
- If it is a string, it should be a pandas offset string. Some possible aliases (which can be preceded by an integer multiple, as in “30min”) are:
  D: calendar day frequency
  W: weekly frequency
  M: month end frequency
  MS: month start frequency
  Q: quarter end frequency
  BQ: business quarter end frequency
  QS: quarter start frequency
  A: year end frequency
  AS: year start frequency
  H: hourly frequency
  T: minutely frequency
  min: minutely frequency
  S: secondly frequency
  L: milliseconds
  U: microseconds
  See the complete list at http://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases
- If it is an int, it should be the number of lines desired in each separated piece.
- If it is None, then the dataframe isn’t separated and a list containing only the full dataframe is returned.
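The time-based splitting can be sketched without pandas by grouping timestamps into fixed windows. This is an illustration, not pymicra's code: split_by_window_sketch is a hypothetical name, records are (timestamp, value) tuples instead of DataFrame rows, and aligning windows to the clock is an assumption.

```python
from datetime import datetime, timedelta
from itertools import groupby


def split_by_window_sketch(records, minutes=30):
    """Split time-sorted (timestamp, value) records into fixed-width windows."""
    window = timedelta(minutes=minutes)
    epoch = datetime(1970, 1, 1)

    def window_index(record):
        # integer number of whole windows elapsed since the epoch
        return (record[0] - epoch) // window

    return [list(chunk) for _, chunk in groupby(records, key=window_index)]


day = datetime(2015, 8, 10)
records = [
    (day + timedelta(minutes=m), float(m)) for m in (0, 10, 29, 30, 65)
]
pieces = split_by_window_sketch(records, minutes=30)
sizes = [len(piece) for piece in pieces]
```

The first three records fall in the 00:00 window, the fourth in the 00:30 window, and the last in the 01:00 window.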
pymicra.algs.numeric¶
pymicra.algs.units¶
-
pymicra.algs.units.
add
(elems, units, inplace_units=False, unitdict=None, key=None)¶ Add elements considering their units
-
pymicra.algs.units.
convert_cols
(data, guide, units, inplace_units=False)¶ Converts data from one unit to the other
Parameters: - data (pandas.DataFrame) – to be changed from one unit to the other
- guide (dict) – {names of columns : units to be converted to}
- units (dict) – units dictionary
- inplace_units (bool) – if inunit is a dict, the dict is updated in place. “key” keyword must be provided
-
pymicra.algs.units.
convert_indexes
(data, guide, units, inplace_units=False)¶ Converts data from one unit to the other
Parameters: - data (pandas.Series) – to be changed from one unit to the other
- guide (dict) – {names of columns : units to be converted to}
- units (dict) – units dictionary
- inplace_units (bool) – if inunit is a dict, the dict is updated in place. “key” keyword must be provided
-
pymicra.algs.units.
convert_to
(data, inunit, outunit, inplace_units=False, key=None)¶ Converts data from one unit to the other
Parameters: - data (pandas.Series) – to be changed from one unit to the other
- inunit (pint.quantity or dict) – unit(s) that the data is in
- outunit (str) – convert to this unit
- inplace_units (bool) – if inunit is a dict, the dict is updated in place. “key” keyword must be provided
- key (str) – if inunit is a dict, it is the name of the variable to be changed
-
pymicra.algs.units.
divide
(elems, units, inplace_units=False, unitdict=None, key=None)¶ Divide elements considering their units
-
pymicra.algs.units.
multiply
(elems, units, inplace_units=False, unitdict=None, key=None)¶ Multiply elements considering their units
-
pymicra.algs.units.
operate
(elems, units, inplace_units=False, unitdict=None, key=None, operation='+')¶ Operate on elements considering their units
Parameters: - elems (list, tuple) – list of pandas.Series
- units (list, tuple) – list of pint.units ordered as the elems list
- inplace_units (bool) – sets dictionary inplace_units
- unitdict (dict) – dict to be set inplace
- key (str) – name of variables to be set inplace as dict key
-
pymicra.algs.units.
parseUnits
(unitstr)¶ Gets unit from string, list of strings, or dict’s values, using the UnitRegistry defined in __init__.py
-
pymicra.algs.units.
with_units
(data, units)¶ Wrapper around toUnitsCsv to create a method to print the contents of a dataframe plus its units into a unitsCsv file.
Parameters: - data (pandas.DataFrame, pandas.Series) – dataframe or series to which units belong
- units (dict) – dictionary with the names of each column and their unit