The Chan Zuckerberg Initiative is set to change science after acquiring a Toronto start-up
Machine learning is quickly becoming a topic of general discussion. It is no longer reserved for data scientists and programmers, and it could very well become the next big thing.
It is the branch of artificial intelligence in which computers analyze large data sets and use them to predict or provide relevant information, without being explicitly programmed to do so. With the growth of data in most fields of study, the challenge is organizing that data into a format that yields useful results.
To better paint a picture of the vast amount of data we are dealing with, consider a phone book that gets updated every minute with phone numbers from individuals across the globe with no apparent order or pattern.
A single person, or even a team of experts, would need a long time to organize the existing entries and work out how future entries should be filed. Machine learning could be applied here: an algorithm (a set of instructions written in code) would organize the current entries in the phone book, create an index and predict the pattern in which future entries should be entered (e.g., alphabetical order).
Now consider a real example of a large, unmanageable data set: scientific knowledge published in papers and abstracts has become difficult to track and keep up to date. In the field of biomedicine alone, researchers publish up to 4,000 papers every day. Although this represents a wealth of information, it also presents a challenge in organizing that information and allowing it to be tracked.
In comes Meta Inc., the Toronto start-up that uses machine learning to sift through and organize this information, with the aim of making scientific papers available to the scientific community and increasing access to and sharing of information. Meta Inc. was recently acquired by the Chan Zuckerberg Initiative, and it has become an excellent example of the importance of machine learning and its implications for the future.
To provide some basic background on how machine learning works, we start with its two main types: supervised and unsupervised learning. In supervised learning, the system is presented with a labeled data set containing inputs and their expected outputs. From these examples, the system learns the rule that maps inputs to outputs.
Supervised learning could be used, for example, to predict car prices, where the input would be the year the car was manufactured and the output would be its price. Having learned the relationship from cars with known prices, the system would be able to predict the price of a car it has not seen.
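To make the car-price idea concrete, here is a minimal sketch of one simple supervised method, a straight-line (least-squares) fit. The years and prices are invented for illustration, and the function names `fit_line` and `predict_price` are hypothetical; a real system would use far more data and more inputs than the year alone.

```python
# Hypothetical (year, price) pairs, invented purely for illustration.
training_data = [(2010, 6000.0), (2012, 8000.0), (2014, 10000.0), (2016, 12000.0)]


def fit_line(points):
    """Least-squares fit of price = slope * year + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in points)
    variance = sum((x - mean_x) ** 2 for x, _ in points)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_price(year, slope, intercept):
    """Apply the learned rule to a year that was not in the training data."""
    return slope * year + intercept


# "Learning" is the fitting step; the labeled examples determine the rule.
slope, intercept = fit_line(training_data)
print(predict_price(2018, slope, intercept))
```

The labeled examples play the role of the "inputs and expected outputs" described above: the program is never told the pricing rule, it estimates one from the data.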
Another relevant example of supervised learning is breast cancer tumour detection, where the system can be used to assess the potential malignancy of a breast tumour based on its size.
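The tumour example differs from the car example in that the output is a category (malignant or not) rather than a number. A minimal sketch of that kind of learning, assuming a single size cut-off is enough, might look like the following. The records are invented for illustration and are not clinical data, and `learn_threshold` and `predict_malignant` are hypothetical names; real diagnostic systems rely on many more features than size.

```python
# Hypothetical (tumour size in cm, is_malignant) records, invented for
# illustration only; these are not clinical data.
labeled_tumours = [
    (0.8, False), (1.1, False), (1.5, False),
    (2.4, True), (3.0, True), (3.6, True),
]


def learn_threshold(records):
    """Pick the size cut-off that misclassifies the fewest training records."""
    sizes = sorted(size for size, _ in records)
    # Candidate cut-offs are the midpoints between neighbouring sizes.
    candidates = [(a + b) / 2 for a, b in zip(sizes, sizes[1:])]

    def errors(cutoff):
        # Count records where "larger than cut-off" disagrees with the label.
        return sum((size > cutoff) != malignant for size, malignant in records)

    return min(candidates, key=errors)


def predict_malignant(size, cutoff):
    """Classify a new tumour using the learned cut-off."""
    return size > cutoff


cutoff = learn_threshold(labeled_tumours)
print(cutoff, predict_malignant(3.2, cutoff), predict_malignant(1.0, cutoff))
```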
The second type of machine learning is unsupervised learning, where the data has no labels and no apparent order or pattern. With unsupervised learning, we can run an algorithm that “clusters” similar items together and so identifies a structure or pattern in the data.
An example of unsupervised learning would be a news search engine such as Google News: thousands of news articles are published and can be clustered, or grouped into sections, according to topic (e.g., articles on Trump’s inauguration).
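As a rough illustration of clustering without labels, the sketch below groups headlines by how many words they share. This is a deliberately simple greedy method chosen for readability, not the technique any particular news service uses; the headlines, the `threshold` value and the function names are all assumptions made for the example.

```python
# Hypothetical headlines, invented for illustration only.
headlines = [
    "president sworn in at inauguration ceremony",
    "crowds gather for inauguration ceremony in washington",
    "local team wins championship final",
    "championship final draws record crowd for local team",
]


def word_overlap(a, b):
    """Jaccard similarity: shared words divided by all distinct words."""
    words_a, words_b = set(a.split()), set(b.split())
    return len(words_a & words_b) / len(words_a | words_b)


def cluster(texts, threshold=0.2):
    """Greedy clustering: add each text to the first group whose first
    member is similar enough, otherwise start a new group."""
    groups = []
    for text in texts:
        for group in groups:
            if word_overlap(text, group[0]) >= threshold:
                group.append(text)
                break
        else:
            # No existing group was similar enough.
            groups.append([text])
    return groups


for number, group in enumerate(cluster(headlines), start=1):
    print(f"Topic {number}: {group}")
```

No headline is tagged with a topic in advance; the groups emerge only from similarity between the texts, which is what distinguishes this from the supervised examples.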
Machine learning is a vast and innovative field: almost any large database or data repository could be updated and organized through machine learning. Studying our galaxy, studying relationships between genes and diseases, financial modeling, predicting stock prices, cybersecurity, smart cars and legal robots are just a few of the areas in which machine learning has been used and will be used.
With the increase in data, it will become imperative to employ machine learning, as human tracking and record keeping can only go so far.