Objective: Develop theory and methods to measure the utility of a given query Q in a given context C over given data D.
Measures of information content and complexity can be used to optimize ‘matching’ and query utility under a given set of conditions, including the context and the data.
In WP1, complexity issues concerning queries on large datasets will be studied, building on the framework described in Adriaans (2009): given a system S of a certain complexity in the world (e.g. the human brain, the climate, DNA, an art style, or simply a railway timetable) and a canonical measurement function, i.e. an information channel with certain characteristics that creates a data set D with information 'about' S, under what conditions may we assume that a query Q of a certain form on D indeed returns adequate information about S?
In this setting we will analyze, amongst other things:
- the conditions under which 'true' isolated facts can be extracted from a data set but no general insights,
- the question of whether undersampled complex systems give rise to power-law distributions (recall that power-law distributions with sufficiently heavy tails have no finite mean), and
- the interplay between model information and complexity in the analysis of various systems (cf. facticity: noise is complex but has a simple model; fractal structures look complex but are simple; many structures in nature, in particular products of evolutionary processes, are both complex and have complex models).
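The claim that heavy-tailed power laws have no finite mean can be made concrete with a small simulation. The sketch below (an illustrative choice, not part of the proposal itself) draws from a Pareto-type distribution with tail index 0.8 via inverse-transform sampling; since the theoretical mean is infinite, the sample mean keeps drifting upward as the sample grows, whereas for a finite-mean exponential distribution it stabilizes near 1:

```python
import random

def sample_mean(draw, n):
    """Average of n draws from the zero-argument generator function `draw`."""
    return sum(draw() for _ in range(n)) / n

random.seed(42)

# Pareto with tail index alpha = 0.8: the theoretical mean is infinite,
# so sample means do not converge as n grows (inverse-transform sampling).
alpha = 0.8
heavy = lambda: (1.0 - random.random()) ** (-1.0 / alpha)

# Exponential with rate 1: finite mean (= 1), sample means stabilize.
light = lambda: random.expovariate(1.0)

for n in (10**2, 10**4, 10**5):
    print(n, round(sample_mean(light, n), 3), round(sample_mean(heavy, n), 3))
```

This is exactly the undersampling problem flagged above: no finite sample characterizes the 'mean behaviour' of such a system.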
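The facticity point can likewise be illustrated with compression. The sketch below is only a crude stand-in: the length of a zlib-compressed string is an upper bound on Kolmogorov complexity (up to a constant), but it measures total data complexity only and does not separate model information from noise, which is precisely what facticity formalizes. Still, it shows the first contrast from the list: pseudo-random 'noise' is incompressible, while data generated by a short program compresses to almost nothing:

```python
import random
import zlib

def compressed_size(data: bytes) -> int:
    """Length of the zlib-compressed data: a crude upper bound on
    Kolmogorov complexity (up to an additive constant)."""
    return len(zlib.compress(data, 9))

random.seed(0)

# 'Noise': pseudo-random bytes -- high data complexity, barely compressible.
noise = bytes(random.getrandbits(8) for _ in range(10_000))

# Regular structure: a short program ("repeat 'ab' 5000 times") generates it,
# and the compressor exploits that -- low data complexity.
regular = b"ab" * 5_000

print(compressed_size(noise), compressed_size(regular))
```

The interesting cases for WP1 are the ones this proxy cannot separate: systems whose data are complex *and* whose best models are themselves complex.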