A “Hot Coffee” Warning Label for AI/ML Users and “Deep Learning” – things are not always what they appear to be.

Where to start with this one. Experienced MLers can likely disregard this article; you should already know the ramifications of your work. I do not profess the talent of many of the coders I’ve had the pleasure of encountering online. There are thousands of wonderful and fascinating projects a person can undertake and rebuild as their own. Who in their right mind would pass up an opportunity to automate employment decisions or to “[Use] Machine Learning to Predict Stock Prices“? If only it were so easy to choose employment candidates or make money. The reality is that decision making is much more complex than many in computer science are willing to admit. In the absence of human checks and balances, good candidates will be missed and monies lost; in the former case an employer could find themselves in a discrimination suit, and in the latter, financial ruin. This article is meant to serve as a simple reminder to use caution before running code you do not understand or leveraging technology recklessly, and it only scratches the surface of where things can go wrong given the complexity of coding and the talents of AI/ML coders. Employers and contractors of coders should also take warning: the code you’ve paid for may not achieve the goals you desire and may in fact make you liable for its results. Buyer beware, know the limits of your request. Programmer beware, know the limits of your code. I anticipate a day of reckoning will one day be had.

(Video: Sophia the Robot on SingularityNET’s Phase 2 Initiative and SophiaDAO)

Like a lot of technology professionals, the pandemic presented me with learning opportunities and a chance to explore associated fields of study. Most of us are fascinated by the thoughts and possibilities that AI can add to our professions and lives in general. AI is touted as being able to save thousands of lives. Who hasn’t been mesmerized by DeepMind’s AlphaGo, Tesla’s “autopilot AI” and Hanson’s Sophia? If you’re like me in your search for understanding, it’s easy to jump into “code-land” in an effort to get a feel for what programming structure is like and to try some down-to-earth practical examples. It is in this search that it pays to find a quality educational source (even if it is open source). This is a practical caution I offer, as not all published examples are credible in their understanding of how AI should be trained, tested and applied. I’m not exactly certain the idea of using AI in a socio-economic context is practical as described in the SophiaDAO SingularityNET effort to expand the consciousness of mankind; it seems a little awkwardly stretched at times. As coding grows and becomes more accessible, a measure of accountability and responsibility will become ever more important.

An example of this need for caution can be found in online publications like Medium, which otherwise has some awesome content. With code being offered freely by authors, it would pay for the prospective coder to validate who the code is from and whether or not it offers a reasonable analysis of the data. Medium appears to host student content, which may be incomplete or have poorly thought-out end goals. This particular caution concerns the programmer’s understanding of how statistical routines are applied and how data is treated, analyzed and ultimately reported. There are other nuances of a more analytic nature that may lie beyond the ability of an otherwise competent coder. After all, analyzing fintech data versus data from CERN’s Large Hadron Collider would require different treatments and skill sets. Of interest, CERN notes on their website that one billion collisions per second generates one petabyte per second of data: a data scientist’s nightmare with or without a background in physics. So, like most things found on the internet, treat code that claims it will make you a millionaire with a grain of salt. I have found several examples from online sources where the intended purpose of the code was flawed in one way or another. Something as simple as biasing the outcome by using the same data set for training and testing has a huge effect on forward-looking statistical projections. Kuhn and Johnson’s 2013 text “Applied Predictive Modeling” points out:

Ideally, the model should be evaluated on samples that were not used to build or fine-tune the model, so that they provide an unbiased sense of model effectiveness. When a large amount of data is at hand, a set of samples can be set aside to evaluate the final model. The “training” data set is the general term for the samples used to create the model, while the “test” or “validation” data set is used to qualify performance.

Max Kuhn and Kjell Johnson, Applied Predictive Modeling, 2013, p. 67

This and other points were reiterated in the online article “What is the Difference Between Test and Validation Datasets?” by Jason Brownlee. These ideas are nothing new; they are merely being overlooked or misapplied in the rush to get results and/or be published. The practice is further corrupted when the programmer mistakenly generates statistics such as correlation values, standard deviations, etc., all on biased data sets. Often this is discovered in troubleshooting, when live results don’t match those produced by the model. This is never more evident than when analyzing stock market data, where there is a tremendous amount of noise on short time frames.
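To make the point concrete, here is a minimal sketch of the difference between scoring a model on its own training data and on a properly held-out set, the practice Kuhn and Johnson describe above. The data, model choice and numbers here are entirely synthetic assumptions for illustration; they come from no particular project.

```python
# Minimal sketch: why evaluating on the training data inflates performance.
# All data here is synthetic; this is an illustration, not anyone's trading code.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

rng = np.random.default_rng(42)
X = rng.normal(size=(500, 10))
y = X @ rng.normal(size=10) + rng.normal(scale=5.0, size=500)  # noisy target

# Hold out a test set BEFORE any fitting or summary statistics are computed.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = Ridge().fit(X_train, y_train)

# Scoring on the data the model was fit to overstates effectiveness...
print("R^2 on training data:", r2_score(y_train, model.predict(X_train)))
# ...while the held-out set gives the unbiased estimate of effectiveness.
print("R^2 on held-out test data:", r2_score(y_test, model.predict(X_test)))
```

Run on noisy data like the above, the training score will come out flattering while the held-out score reveals how much of the “fit” was just noise, which is exactly the gap that surfaces later when live results don’t match the model.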

With so much on the line, coders need to be astute about getting their code peer reviewed before publishing erroneous code sets that could, one day, cause more harm than good. This will become ever more important with the advent of GitHub’s OpenAI-powered project Copilot. While excited about the prospect of having AI assist in coding, many have reservations regarding the code Copilot provides. It would seem, at first anyway, that bad code can and will be propagated.

Much of this could be avoided if potential coders understood the pitfalls that can be encountered in data analysis. Erring on the side of caution about our own potential shortcomings, perhaps as coders we should add disclaimers. Nigel Duffy’s article “How machine learning projects go wrong” and Sandeep Uttamchandani’s “98 things that can go wrong in an ML project” highlight some of the ways a project can fail. If you are a practical programmer who learns by doing projects, checking whether the data set is divided into training and test sets before proceeding with the code would be a good starting point, BEFORE rebuilding a project that is fraught with errors and only reinforces bad coding; a hypothetical check of this kind is sketched below. Please keep in mind this article has focused on the coding aspects of AI/ML projects and only hinted at the treatment of data; there are a host of other issues to be considered as well. Coders will soon find themselves dealing with ethical questions as AI/ML makes decisions that affect the lives of everyday people. Where does the responsibility for bad code lie? From the programmer’s perspective, the unintended use or misuse of code found openly on the internet ultimately lies with the end user, but I doubt it will stop there. Just as with an unlabeled hot cup of coffee, coders may someday find themselves innocently in court over their test “hello world” code.
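As one hedged example of such a starting-point check: before trusting a downloaded project’s results, you could verify that the rows it evaluates on do not also appear in its training data. The file names and column layout below are hypothetical assumptions, stand-ins for whatever data files the project actually ships with.

```python
# Hypothetical sanity check before reusing a downloaded ML project:
# do any rows appear in BOTH the training and test data? If so, the
# reported results are biased. File names here are assumptions.
import pandas as pd

train = pd.read_csv("train.csv")
test = pd.read_csv("test.csv")

# An inner merge on all shared columns finds rows present in both sets.
overlap = pd.merge(train, test, how="inner")
print(f"{len(overlap)} of {len(test)} test rows also appear in the training data")

if len(overlap) > 0:
    print("Warning: evaluation is biased; results will look better than they are.")
```

A few minutes spent on a check like this is far cheaper than rebuilding a project on top of a leaky evaluation.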

With this knowledge in hand, it’s easy to understand why there aren’t more AI millionaires trading the stock market. Should I become a millionaire and open a coffee shop in some far-off exotic location like Bali, mention this article and receive a free cup of coffee with the purchase of a pastry, complete with an appropriate warning label. I’ll be keeping the winning code to myself.
