Presentation by Haicheng Liu “Comparing NetCDF and a multidimensional array database on managing and querying large hydrologic datasets: a case study of SciDB”, October 28, 13.00, TU DELFT


Like many ICT-related domains, hydrology is entering the era of big data, and managing large volumes of data is a potential issue facing hydrologists. At present, however, hydrologic data research is mostly concerned with data collection, interpretation, modelling and visualization; the management and querying of large datasets draw little interest. The motivation for this research originates from a specific data management problem reported by Hydrologic Research B.V.: time series extraction takes an intolerably long time when a large multidimensional dataset is stored in the NetCDF classic or 64-bit offset format. The essence of this issue lies in the contiguous storage structure adopted by these NetCDF formats. In this research, therefore, the NetCDF-4 format and a multidimensional array database that apply a chunked storage structure are benchmarked to learn whether and how chunked storage can benefit the queries executed by hydrologists. Going a step further, NetCDF file-based solutions and a database solution are compared comprehensively, considering both management and query performance for large hydrologic datasets.
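
The contrast between contiguous and chunked storage can be made concrete with a rough count of how many separate reads one time series needs. The sketch below is only an illustration under simplifying assumptions, not the benchmark used in the research: it assumes a (time, y, x) array laid out in row-major order with time as the slowest-varying dimension, and the array shape, chunk shape and function names are all hypothetical.

```python
import math


def contiguous_reads(shape):
    """Separate reads needed for one cell's time series in a contiguous,
    row-major (time, y, x) layout (simplified model, an assumption)."""
    t, y, x = shape
    # Consecutive time steps of one cell lie y * x values apart, so every
    # time step is its own non-adjacent read unless the grid is one cell.
    return t if y * x > 1 else 1


def chunks_touched(shape, chunk_shape):
    """Chunks that must be read for one cell's time series when the array
    is split into chunks of chunk_shape (time, y, x)."""
    t, _, _ = shape
    ct, _, _ = chunk_shape
    # A single cell falls into exactly one chunk per block of ct time steps.
    return math.ceil(t / ct)


# Hypothetical sizes, chosen only for illustration.
shape = (10_000, 500, 500)
chunk_shape = (1_000, 10, 10)

print(contiguous_reads(shape))             # 10000 scattered reads
print(chunks_touched(shape, chunk_shape))  # 10 chunk reads
```

In this model, chunks that extend along the time axis replace thousands of scattered reads with a handful of larger ones. The cost is that each chunk also carries values from neighbouring cells that the query does not need, so whether the trade pays off depends on the chunk shape and on the queries being run.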

