Sep-24-2018, 01:34 PM
I’m developing a project that uses a large .csv file (more than 3,000,000 lines) with multiple columns. Depending on the user’s input, I want to display a certain column as a graph.
I wrote the following code, which works but is very slow. At startup I load all the data from the csv file into a DataFrame (df). When the user enters the requested column, I copy only that column to a new df and display it.
Is it possible to make the code run faster?
I also want to be able to select a certain time range (the index of the df), but I don’t see the easiest way to do that. Is it possible to select a range of rows and copy only those rows to the new df?
import pandas_datareader.data as web
import datetime
import dash
import dash_core_components as dcc
import dash_html_components as html
from dash.dependencies import Input, Output
import pandas as pd

app = dash.Dash()
df = pd.read_csv("sitka_weather_2014.csv")
df.reset_index(inplace=True)
df.set_index('AKST', inplace=True)

app.layout = html.Div(children=[
    html.Div(children='''
        Symbol to graph:
    '''),
    dcc.Input(id='input', value='1', type='text'),
    html.Div(id='output-graph'),
])

@app.callback(
    Output(component_id='output-graph', component_property='children'),
    [Input(component_id='input', component_property='value')]
)
def update_value(input_data):
    start = datetime.datetime(2015, 1, 1)
    end = datetime.datetime.now()
    cols = [int(input_data)]
    df_display = df[df.columns[cols]]
    df_display.columns = ['column1']
    return dcc.Graph(
        id='example-graph2',
        figure={
            'data': [
                {'x': df_display.index, 'y': df_display.column1, 'type': 'line', 'name': input_data},
            ],
            'layout': {
                'title': input_data
            }
        }
    )

if __name__ == '__main__':
    app.run_server(debug=True)

Thanks in advance
Brecht
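
One possible way to handle the time-range question is sketched below. It is not a tested fix for the code in the post. It assumes the 'AKST' column holds dates that pandas can parse; the sample data, the other column names, and the helper names `load_frame` and `select_range` are made up for illustration. The idea is to parse the dates once at load time so the index is a sorted DatetimeIndex, then pick one column by position and slice it with `.loc`, which includes both endpoints for label slices.

```python
import io

import pandas as pd

# Tiny made-up stand-in for the real CSV. Only the 'AKST' column name
# comes from the post; the values and other column names are invented.
CSV_TEXT = """AKST,Max TemperatureF,Min TemperatureF
2014-01-01,46,42
2014-01-02,47,41
2014-01-03,45,40
2014-01-04,43,38
2014-01-05,40,35
"""


def load_frame(source):
    """Read the CSV once, parse 'AKST' as dates and use it as a sorted index."""
    df = pd.read_csv(source, parse_dates=["AKST"], index_col="AKST")
    return df.sort_index()


def select_range(df, col_pos, start, end):
    """Return one column (chosen by position) limited to start..end, inclusive."""
    return df.iloc[:, col_pos].loc[start:end]


df = load_frame(io.StringIO(CSV_TEXT))
subset = select_range(df, 0, "2014-01-02", "2014-01-04")
print(subset)
```

In a callback like `update_value`, `subset.index` and `subset.values` could be passed as the graph's 'x' and 'y'. Column positions here count only the data columns, not the index, so they may differ from the positions in the code above. Slicing before plotting also reduces the number of points handed to the graph, which may help with the slowness, although that has not been measured here.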