Data Science

What's it about?

Data science is a recent term. Nate Silver said, as part of his keynote address in 2013, “I think data-scientist is a sexed-up term for a statistician”, but by 2014, the American Statistical Association renamed one of its journals to "Statistical Analysis and Data Mining: The ASA Data Science Journal", and by 2016 had changed it to "Statistical Learning and Data Science". Data science deals with large data sets, and with data and information drawn from many different settings.

Data scientists are sought after, as more businesses, government agencies, and international organisations have massive computing power combined with equally massive data sets (everything from digital photos to personal data) available to them, and want to apply machine-learning techniques to a wide range of problems. For this reason, a 2012 Harvard Business Review article was headed "Data Scientist: The Sexiest Job of the 21st Century". Universities are now setting up data science learning programmes to meet this need. Data science roles are now the most desirable (and well-paid) jobs in the US!

Data scientists’ most basic, universal skill is the ability to write code. They also need to communicate in language that all their stakeholders understand, and to tell stories with data, whether verbally, visually, or ideally both, as in an infographic. Most importantly, they need to be intensely curious: to have a desire to go beneath the surface of a problem, find the questions at its heart, and distil them into a very clear set of hypotheses that can be tested. This curiosity characterises the most creative scientists in any field.

What's driving this?

Increasingly, we see evidence of how data science and analytics are used in decision making across multiple sectors, along with expert opinion on what the future might hold. The focus on data science in this trend recognises the related areas of Big Data and Data Analytics that have been the focus of previous trends in the CORE Ten Trends series.

We need to know how we can benefit from data science, while at the same time being aware of how it affects our behaviour. For example, it can make our lives easier:

  • Internet search: every search engine nowadays makes use of data to provide you with the best results, in just seconds!
  • Digital advertisements: ever wondered how the sites you visit know how to display advertisements for the sorts of things you are interested in?
  • Recommender systems: think of how Amazon is able to recommend other books for you based on what you’ve been searching for, or how even Netflix now uses a percentage system to show how closely a show matches ones you have watched before.
  • Price comparison websites: aggregating data from dozens of other sites and presenting it to you in seconds.
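
To make the idea of a "percentage match" concrete, here is a minimal sketch of how such a score could be calculated. It is not how Netflix or Amazon actually work; their methods are not described here. It simply assumes each show has a set of descriptive tags and scores a show by how much its tags overlap with the tags of shows already watched (a measure known as Jaccard similarity). The function name, the show titles, and the tags are all made up for illustration.

```python
def match_percent(watched_tags: set, show_tags: set) -> int:
    """Return the overlap between two tag sets as a whole-number percentage.

    This is the Jaccard similarity: shared tags divided by all distinct tags.
    """
    combined = watched_tags | show_tags
    if not combined:
        return 0  # nothing to compare
    overlap = watched_tags & show_tags
    return round(100 * len(overlap) / len(combined))


# Hypothetical viewing history and catalogue, for illustration only.
watched = {"drama", "crime", "thriller"}
catalogue = {
    "Show A": {"crime", "thriller", "mystery"},
    "Show B": {"comedy", "romance"},
    "Show C": {"drama", "crime", "thriller"},
}

# Score every show, then list the best matches first.
ranked = sorted(
    ((match_percent(watched, tags), title) for title, tags in catalogue.items()),
    reverse=True,
)

for percent, title in ranked:
    print(f"{title}: {percent}% match")
```

With this made-up data, Show C shares every tag with the viewing history and comes out on top, while Show B shares none. Real recommender systems draw on far richer data, but the underlying idea of comparing what you have done with what is available is similar.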

We can also be misled as information, ideas, or beliefs are amplified or reinforced by communication and repetition inside a defined system (such as Facebook). Those who write about this have used the metaphor of an echo chamber — a situation in which official sources often go unquestioned and different or competing views are censored, disallowed, or otherwise underrepresented.

“Machine learning has enabled us to unknowingly ignore the diversity around us. It’s bad enough being uninformed about a topic you’re passionate about. It’s far worse to falsely believe you’re fully informed on that topic.”

From "Blame Machine Learning for Your Echo Chamber" by Derek Hsiang
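
The amplification described above, where repetition inside a closed system narrows what we see, can be sketched as a toy feedback loop. This is an illustration under strong assumptions, not a model of Facebook or any real platform: the feed has only a fixed number of slots, it always shows the topics clicked most so far, and the reader clicks on everything shown. The function name, topics, and starting numbers are all hypothetical.

```python
from collections import Counter


def simulate_feed(initial_clicks: dict, rounds: int, slots: int) -> Counter:
    """Repeatedly show the most-clicked topics and count the resulting clicks."""
    clicks = Counter(initial_clicks)
    for _ in range(rounds):
        # The feed only has room for the currently most popular topics.
        shown = [topic for topic, _ in clicks.most_common(slots)]
        for topic in shown:
            clicks[topic] += 1  # assumption: the reader clicks everything shown
    return clicks


def top_share(clicks: Counter, slots: int) -> float:
    """Fraction of all clicks that belong to the most-clicked topics."""
    top = sum(count for _, count in clicks.most_common(slots))
    return top / sum(clicks.values())


# Hypothetical starting interests: fairly mixed, with a slight lean.
start = {"sport": 4, "politics": 3, "science": 2, "arts": 2, "local news": 1}
end = simulate_feed(start, rounds=20, slots=2)

print(f"Top two topics at the start: {top_share(Counter(start), 2):.0%} of clicks")
print(f"Top two topics after 20 rounds: {top_share(end, 2):.0%} of clicks")
```

In this sketch the two topics with a small head start are the only ones ever shown again, so science, arts, and local news never gain another click. A slight initial preference becomes most of what the reader sees, which is the echo chamber effect in miniature.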

What examples of this can I see?

One of the benefits of data science is that the information can be in any format, not just achievement information. It can also come from any source, not just from schools/kura or early years settings.

Data science is helping us in education. Consider the following examples:

  • e-asTTle: the NZ-developed online learning and assessment tool that provides accessible information in the form of a graphical dashboard.
  • PaCT (Progress and Consistency Tool): aggregating data to support professional judgments in reading, writing, and mathematics.
  • Secondary tertiary transition app: data from schools and tertiary settings showing what young people do over time after leaving school.
  • Public Achievement Information (PAI): a collection of infographics that summarise a wide range of education topics.

How might we respond?

One gnarly challenge in education is how to spread what works. How do we deeply understand the challenges we face in the education sector and learn from each other? At the moment, we have a large number of closed systems: each school, each kura, and each early years centre is separate from the others. There is no large source of data and information to draw on and learn from. Imagine how much we could learn from each other if we saw education as an ecosystem (as described in Derek Wenmoth’s think piece on leadership): an open system with a multitude of tools, technologies, and platforms that work together. We could develop tools that use data and information to identify what works where, when, and why, and that could then support teachers with similar students.

We are starting to develop more open Communities of Learning | Kāhui Ako, in which data and information from a range of sources, such as school-based Student Management Systems (SMS), Learning Management Systems (LMS), health, social welfare, and local community data, can be shared to support individual young people to succeed.

The New Zealand Data Futures Forum identified four principles to help New Zealanders navigate the data future. These principles can be applied at the level of a school, kura, or early years setting, or of a Community of Learning | Kāhui Ako.

  • Value: New Zealand should use data to drive economic and social value and create a competitive advantage.
  • Inclusion: All parts of New Zealand society should have the opportunity to benefit from data use.
  • Trust: Data management in New Zealand should build trust and confidence in our institutions.
  • Control: Individuals should have greater control over the use of their personal data.

At a national level, education agencies, along with other social sector agencies, are using protocols and data science to help people know what to do, and to evaluate whether what they do is good enough. They have access to wide-ranging data held in government-run systems that span social, health, justice, and other agencies. The same four principles can help all educators, whether working in a closed setting or a more open one, to think about what we do with data and information and how we support our young people to think as data scientists.

Some questions to act as a stimulus with your colleagues

Being transparent in our use of data and information

Do educators have protocols, governance, and practices for using data and information that:

  • support all young people to experience education success and to have choice in the pathways they take to participate in life, education, and citizenship?
  • are inclusive, both in what we collect or don’t collect and in how we use it?
  • earn the trust of those who should benefit, so that they are confident we will use the data and information in ways that benefit them?
  • let those who own the data and information say what we can use and how?

Supporting our young people to have the capabilities to be data scientists

Do we design and orchestrate learning opportunities for young people that enable them to develop:

  • coding and computation skills for algorithms?
  • perspective taking, to want to understand problems from different points of view?
  • critical inquiry, including curiosity and questioning about where the data and information has come from and whose voices are missing?
  • story-telling skills (visual, with data and with words, infographics) in ways that support others to take action?

Supporting our young people to be transparent in their use of data

  • Do we design and orchestrate learning opportunities for young people to develop protocols and practices that support the use of data and information and that follow these four principles: value, inclusion, trust, and control?