reports
- TH.reports.geolocate(*args, **kwargs) → pandas.core.frame.DataFrame
Adds geolocation info to the rows of the input DataFrame.
- Parameters
input – pandas DataFrame to which geolocation info is added
ip_column – Name of the input column containing IPs to geolocate
- Returns
Merge of the input pandas DataFrame and the geolocation info
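The merge described above can be pictured with plain pandas. This is a minimal sketch, not the implementation of `TH.reports.geolocate`: the lookup table `GEO_LOOKUP`, its `country` and `city` columns, and the function name `geolocate_sketch` are all hypothetical, and the real geolocation source is not documented here.

```python
import pandas as pd

# Hypothetical lookup table standing in for the real geolocation source.
GEO_LOOKUP = pd.DataFrame({
    "ip": ["192.0.2.10", "198.51.100.7"],
    "country": ["ES", "US"],
    "city": ["Madrid", "Seattle"],
})


def geolocate_sketch(input_df: pd.DataFrame, ip_column: str) -> pd.DataFrame:
    """Left-merge the lookup columns onto input_df, keyed on ip_column."""
    # Rename the lookup key so both frames share the same key column name.
    lookup = GEO_LOOKUP.rename(columns={"ip": ip_column})
    # A left merge keeps every input row; unknown IPs get NaN in the new columns.
    return input_df.merge(lookup, how="left", on=ip_column)


events = pd.DataFrame({
    "UserName": ["alice", "bob", "carol"],
    "ClientIP": ["192.0.2.10", "203.0.113.5", "198.51.100.7"],
})
located = geolocate_sketch(events, "ClientIP")
print(located)
```

A left merge is assumed so that rows with an unknown IP are kept rather than dropped; the library may behave differently.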
- TH.reports.group(*args, **kwargs) → pandas.core.frame.DataFrame
Builds a report pandas.DataFrame from the given dataframe by grouping its rows on the given column values and counting the rows in each group.
- Parameters
input – the source dataframe
by – List of columns of the source dataframe used to group the rows
sum – When defined, sums the values of the specified column instead of counting rows
name – Name of the additional column created with the count for each of the grouped rows
Code example:
    from TH import reports

    def do_report(source_dataframe):
        # Count rows per (MUID, UserName) pair into a new 'Actions' column
        return reports.group(source_dataframe, by=['MUID', 'UserName'], name='Actions')
- Returns
DataFrame with the resulting data or exception
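To make the count-versus-sum distinction concrete, here is a rough pandas equivalent. It is a sketch under assumptions, not the code behind `TH.reports.group`: the function name `group_sketch` is hypothetical, and the documented `sum` parameter is spelled `sum_column` here only to avoid shadowing the Python builtin.

```python
import pandas as pd


def group_sketch(input_df, by, name, sum_column=None):
    """Group input_df on the `by` columns and count rows, or sum one column."""
    grouped = input_df.groupby(by)
    if sum_column is None:
        # Default behaviour: number of rows in each group.
        values = grouped.size()
    else:
        # When a column is given, add up its values per group instead.
        values = grouped[sum_column].sum()
    # Turn the group keys back into columns and name the value column.
    return values.reset_index(name=name)


actions = pd.DataFrame({
    "MUID": ["m1", "m1", "m2", "m1"],
    "UserName": ["alice", "alice", "bob", "alice"],
    "Bytes": [10, 20, 5, 30],
})
counted = group_sketch(actions, by=["MUID", "UserName"], name="Actions")
summed = group_sketch(actions, by=["MUID", "UserName"], name="TotalBytes",
                      sum_column="Bytes")
print(counted)
print(summed)
```

With this sample data, `counted` has one row per (MUID, UserName) pair with the number of matching rows, while `summed` carries the total of `Bytes` for the same pairs.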
- TH.reports.profile(left: pandas.core.frame.DataFrame, right: pandas.core.frame.DataFrame, column: str) → pandas.core.frame.DataFrame
Obtains a pandas.DataFrame as the result of profiling two (left and right) dataframes.
Both dataframes must be of the same type, or else the operation will fail. The resulting dataframe contains the source rows, except those whose values for the given column match values in the train dataframe.
- Parameters
left – The source dataframe with the current data
right – The dataframe against which the profiling is performed
column – The column we will use to perform the profiling
Code example:
    from TH import reports

    def do_profile(train_period, test_period):
        # obtain_dataframe is assumed to be defined elsewhere
        df1 = obtain_dataframe(period=train_period)
        df2 = obtain_dataframe(period=test_period)
        return reports.profile(df1, df2, 'column_name')
- Returns
DataFrame with the resulting data or exception
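One plausible reading of the profiling step is an anti-join: keep the rows of `left` whose value in `column` never appears in `right`. The sketch below illustrates that reading with plain pandas. It is an assumption about the semantics, not the implementation of `TH.reports.profile`, and `profile_sketch` is a hypothetical name. The description does not spell out which argument plays the "train" role, so `right` is treated as the baseline here.

```python
import pandas as pd


def profile_sketch(left, right, column):
    """Return the rows of `left` whose `column` value is absent from `right`."""
    # Boolean mask: True where the value was already seen in the baseline.
    already_seen = left[column].isin(right[column])
    # Keep only the unseen rows and renumber the index from zero.
    return left[~already_seen].reset_index(drop=True)


current = pd.DataFrame({
    "UserName": ["alice", "bob", "dave"],
    "Actions": [4, 2, 9],
})
baseline = pd.DataFrame({
    "UserName": ["alice", "bob", "carol"],
    "Actions": [3, 1, 7],
})
new_users = profile_sketch(current, baseline, "UserName")
print(new_users)
```

Only the compared column matters in this sketch: `dave` survives because the name is missing from the baseline, regardless of the `Actions` values.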
- TH.reports.top(*args, **kwargs) → pandas.core.frame.DataFrame
Obtains a pandas.DataFrame with the top results of a given dataframe.
- Parameters
input – the source dataframe
n – Number of rows for the resulting dataframe
by – Name of the column used to order the dataframe results
ascending – Sort order on column ‘by’: False returns the ‘n’ greatest results, True returns the ‘n’ lowest
Code example:
    from TH import reports

    def do_top(source_dataframe):
        # Keep the 10 rows with the highest 'Actions' value
        return reports.top(source_dataframe, n=10, by='Actions', ascending=False)
- Returns
DataFrame with the resulting data or exception
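The effect of `n`, `by`, and `ascending` can be approximated with a sort followed by a head. This is a minimal sketch that assumes `ascending` follows the convention of pandas `sort_values`; it is not the code behind `TH.reports.top`, and `top_sketch` is a hypothetical name.

```python
import pandas as pd


def top_sketch(input_df, n, by, ascending=False):
    """Sort input_df on column `by` and keep the first n rows."""
    # ascending=False puts the largest values first, so head(n) is the top n.
    return input_df.sort_values(by=by, ascending=ascending).head(n)


report = pd.DataFrame({
    "UserName": ["alice", "bob", "carol", "dave"],
    "Actions": [5, 42, 17, 8],
})
highest = top_sketch(report, n=2, by="Actions", ascending=False)
lowest = top_sketch(report, n=2, by="Actions", ascending=True)
print(highest)
print(lowest)
```

Under that assumption, `highest` holds the rows for `bob` and `carol`, and `lowest` holds the rows for `alice` and `dave`.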