VISIONS AND OBJECTIVES

Fulfilling the promise of the big data revolution, the centre produces analytical tools to extract knowledge from complex data and deliver BigInsight. Despite extraordinary advances in the collection and processing of information, much of the potential residing in contemporary data sources remains unexploited.

There is dramatic scope for industries, companies and nations – including Norway – to create value by employing novel ways of analysing complex data. The complexity, diversity and dimensionality of the data, and our partners’ innovation objectives, pose fundamentally new challenges to statistical inference. We develop original, cutting-edge statistical, mathematical and machine learning methods, produce high-quality algorithms implementing these approaches and thereby deliver new, powerful and operational solutions.

‘In the next 50 years, ample data will be available to measure the performance of algorithms across a whole ensemble of situations. This is a game changer for statistical methodology. Instead of deriving optimal procedures under idealized assumptions within mathematical models, we will rigorously measure performance by empirical methods, based on the entire scientific literature or relevant subsets of it’.
— David Donoho, 50 Years of Data Science. (2015)

BigInsight’s research converges on two central innovation themes:

  • personalised solutions: to move away from operations based on average and group behaviour towards individualized actions
  • predicting transient phenomena: to forecast the evolution of unstable phenomena for systems or populations that are not in equilibrium, and to design intervention strategies for their control

Our solutions are significantly better than the state of the art, thanks to brilliant, courageous and creative generic methodologies that extract knowledge from complex data. Generic methodologies and their new applications are published in international scientific journals.

Through training, capacity building and outreach, BigInsight contributes to growth and progress in the private and public sectors, in science and in society at large, preparing a new generation of statisticians and machine learners ready for the knowledge-based economy of the future.

‘Statistics is the science of learning from data, and accounting for relevant uncertainties. As such, it permeates the physical, natural, and social sciences, as well as public health, medicine, technology, business, and policy’.
— American Statistical Association

Innovation themes and BigInsight’s objectives

The industrial, business and public partners of BigInsight have different core activities, yet they unite in the centre to tackle together a set of common challenges, the solutions to which will shape their coming years and their enterprise identities:

  • The first common theme for our partners is to offer a radically new collection of products, services and instruments that adapt to and target individual needs and conditions, thus providing dramatically improved quality and efficacy. Identifying highly specific segments allows products and services to be tailored more precisely. From each partner’s perspective, an individual may be a customer, user, patient, citizen, ship, company, sensor, smart power meter, taxpayer, etc.
  • The second common objective for our partners is to empower their own decisions with precise predictions of critical quantities, which are unstable and in dynamic transition, in order to enable intervention and control. Again, the future quantities of interest differ for each partner: customer retention probabilities, cancer survival, electricity prices, the probability of success of a new product, recovered taxes, service reliability, etc.

Therefore, BigInsight identifies two Central Innovation Themes that mirror these key challenges and are supported by high-quality data of unprecedented availability, at the needed scales.

• Predicting transient phenomena

The modern engineering of measurement instruments, the new demands of markets and society, and a widespread focus on data acquisition give rise to high-frequency time series data. As never before, we are able to measure processes as they evolve while not in a stable situation, not in equilibrium. A patient receiving cancer treatment (OUS), a sensor on a ship at sea (ABB, DNV GL), a customer offered products from several providers (Telenor, Gjensidige, DNB), a worker who has lost a job (NAV), the price of an asset in a complex market (Norsk Hydro) – these are all examples of systems in a transient phase. The objective is to predict the dynamics, the future performance and the next events. Importantly, real-time monitoring of such transient behaviour, together with a causal understanding of the factors that affect the process, allows optimal interventions and prevention. Each BigInsight partner monitors specific processes in time, for precise reasons, and has specific intervention instruments. Norsk Hydro forecasts electricity prices in an evolving market with unregulated energy production sources and is concerned with extreme prices, their variability and their external causes. Interventions for Norsk Hydro mean both control of its own energy production and financial operations on the European energy market. Telenor, DNB, NAV, Skatteetaten and Gjensidige are interested in predicting certain behaviours of their customers and service users, and in finding the causes of churn, financial crime or fraud, in order to step in with new prices, products, legal actions or investigations. For OUS and DNV GL, the availability of real-time monitoring of patients and healthcare institutions allows completely new screening protocols and treatment monitoring, real-time prevention and increased safety, thanks to prompt medical and nursing action.
For ABB and DNV GL, high-dimensional time series are generated by sensors monitoring a ship or an industrial installation, with the purpose of predicting operational drifts or failures and redesigning inspection and maintenance protocols. These examples indicate how diverse the concrete objectives of the partners are, but again we see very clear parallels:

  • systems operate in a transient phase, out of equilibrium and exposed to external forcing;
  • in some cases, there are many time series that are very long and of high frequency; in other cases, the series are short, with more irregular measurements;
  • complex dependence structure between time series;
  • unknown causes of abnormal behaviour;
  • possibilities to intervene to retain control.

The objective of BigInsight is to develop new statistical methodology that will allow our partners to produce new and more precise predictions in unstable situations, in order to make the right decisions and interventions, thus delivering value for the partners.

• Personalised solutions

The core business and operations of our partners involve interacting with many individual units: at Telenor, millions of individual mobile phone customers are part of a communication network; at Gjensidige, a million policyholders share risks of contingent, uncertain losses; for DNB, millions of customers and companies invest and lend money; at OUS, many thousands of individual patients are screened for cancer or treated; at the Norwegian Institute of Public Health, individuals are susceptible to infections, or are infectious themselves; NAV supports hundreds of thousands of people with special needs in relation to retirement, the labour market and challenging life situations; for Skatteetaten, millions of taxpayers (persons, organizations) provide money for the functioning of the state; for DNV GL, individual units produce and utilize energy; for DNV GL and ABB, hundreds of sensors register the functional state of a ship at sea; for DNV GL and OUS, a multitude of sensors monitor safety in healthcare. It is fascinating to see how different, and yet partially exchangeable, the specific individual units are for each of our partners. There are many common characteristics:

  • a high number of units/sensors;
  • in some cases, massive data for each unit; in other cases, more limited information;
  • complex dependence structure between units;
  • new data types, new technologies and new regulations are available;
  • in most cases, units have their own intelligence, and are exposed to their environment.

Every partner has specific management objectives for its units, but they share the goal of deeply innovating the management of their units, by recognising similarities and exploiting diversity between units. This will allow personalised marketing, personalised products, personalised prices, personalised risk assessments, personalised fraud assessment, personalised screening, personalised therapy, individualised sensor monitoring, individualised maintenance schemes, individualised power production, and more – each providing value to our partners, to the individuals and to society. The value takes different forms for each partner, as we shall detail in section 5, and includes better health, reduced churn, strengthened competitiveness, recovered taxes, improved fraud detection and optimised maintenance plans.

Methods

The traditional toolbox of statistics and machine learning will be the fundamental resource, drawing among others on these areas: clustering, focused inference, functional data analysis, graphical models, hierarchical Bayesian models, data integration, model comparison and model improvement, multiple testing, multivariate dependence and copula models, extreme value theory, non-parametric Bayes, non-stationary and non-linear stochastic processes and time series, sequential inference, stochastic geometry and space-time models, subsampling and data thinning.