New Capability From OpenAI Can Improve Anyone’s Ability To Analyze Data

This post examines a new capability from OpenAI called Code Interpreter. It walks through a demo of how to use this new capability and in doing so gives an example of why it could be so empowering to anyone seeking to analyze data for themselves.

I demonstrate this new OpenAI capability with a question I have held personal theories and beliefs about for years, while wondering what the data would say: does a full moon really have an effect on human or animal behavior?

Versions of stories about full moon behaviors vary widely. Personally, I have heard stories of emergency rooms increasing staffing on full moons and police departments putting more officers on the street on these days. Another frequently heard story is that animals are more likely to go wild under a full moon. Some believe there are more car accidents when the moon is full.

If our understanding of the four fundamental forces of nature (gravity, electromagnetism, the strong nuclear force and the weak nuclear force) is correct, there is really no way for full moons to have any real physical impact on behavior. But if humans and animals evolved over millions of years under a moon that is sometimes bright, perhaps there is something innate in us that causes different behaviors when the moon is full.

This paper captures the results of a data analysis to examine this topic. Every full moon date is known. Many other potential data sources exist to test against this. We selected one on animal bites. If full moons have an impact on animal or human behavior, we would expect to see more animal bites during a full moon.

Methods

Two datasets were utilized in this study: animal bite data and full moon dates, both available via the online community Kaggle. A new tool developed by OpenAI, the Code Interpreter, was utilized to conduct data extraction, preparation, and analysis. Using the Code Interpreter, we calculated the total number of bites that occurred on full moon days and non-full moon days, as well as the average number of bites per day for both categories.

A process note: Extracting, refining and preparing data for analysis can be hard and time-consuming, and using multiple datasets to seek correlation is also challenging. For non-developers like myself it is such a huge turnoff that I don’t spend any time exploring questions like the one I posed here. There is little reason to examine frivolous but interesting topics that are not related to business. As a citizen scientist I leave the advancement of science to those who are getting paid for it!

However, something just changed in my ability to analyze data. OpenAI just released a new capability called Code Interpreter. This is a version of ChatGPT that knows how to write and execute software code, and can work with file uploads. This means that ChatGPT can be asked to help with data analysis by producing code to conduct that analysis, as well as creating visualizations of the results.

Here is how the session went. 

First, I downloaded data I thought most relevant. The data is in the public domain, with both datasets available via the online community Kaggle (see below). 

Using chat.openai.com, I went to settings and enabled the beta feature called “Code Interpreter”.

From there, start a new chat.  The GPT-4 model needs to be selected and the Code Interpreter beta needs to be activated. 

Now the entry chat window allows files to be uploaded. I uploaded my two csv datasets and started asking questions. 

I then uploaded a zip file with the animal bite data. 

The earliest date in the animal bites dataset is 2010-01-01, and the latest date is 2018-04-17.

ChatGPT helped me understand the data in the bite file, including spotting entries with obvious errors, such as a date entry of 5013-07-15.
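
A check of this kind can be sketched in a few lines of Python. This is only an illustration, not the code ChatGPT generated in the session: the function name `filter_plausible_dates` and the sample values are hypothetical, and the bounds simply reuse the date range reported above.

```python
from datetime import date

# Bounds taken from the date range reported for the bite dataset.
EARLIEST = date(2010, 1, 1)
LATEST = date(2018, 4, 17)

def filter_plausible_dates(raw_dates):
    """Split ISO date strings into (kept, rejected) using the study window."""
    kept, rejected = [], []
    for text in raw_dates:
        try:
            parsed = date.fromisoformat(text)
        except ValueError:
            # Not a valid calendar date at all.
            rejected.append(text)
            continue
        if EARLIEST <= parsed <= LATEST:
            kept.append(parsed)
        else:
            # Parses fine but falls outside the plausible window.
            rejected.append(text)
    return kept, rejected

# Hypothetical sample rows, including the erroneous entry mentioned above.
sample = ["2012-06-04", "5013-07-15", "2017-11-30", "not-a-date"]
kept, rejected = filter_plausible_dates(sample)
print(kept)
print(rejected)
```

Rejected values are returned rather than silently dropped so they can be reviewed by hand.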

The system then helped me analyze data, and did not need any input from me in doing so. It explained what it was doing step by step. It calculated how many bites occurred on full moon days and non-full moon days. And it calculated the proportion of both relative to the total.
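
The counting step described here might look roughly like the following. This is a sketch under stated assumptions, not the session's actual code: the function name `summarize_bites` and the toy dates are hypothetical, and it assumes that days with zero reported bites still count toward the per-day averages, which the original analysis may or may not have done.

```python
from collections import Counter
from datetime import date

def summarize_bites(bite_dates, full_moon_dates, start, end):
    """Return totals and per-day averages for full moon days vs all other days."""
    full_moons = {d for d in full_moon_dates if start <= d <= end}
    total_days = (end - start).days + 1          # inclusive window
    full_moon_days = len(full_moons)
    other_days = total_days - full_moon_days

    # Number of bites recorded on each calendar day inside the window.
    per_day = Counter(d for d in bite_dates if start <= d <= end)
    full_moon_bites = sum(n for d, n in per_day.items() if d in full_moons)
    other_bites = sum(per_day.values()) - full_moon_bites

    return {
        "full_moon_bites": full_moon_bites,
        "other_bites": other_bites,
        "full_moon_share": full_moon_bites / (full_moon_bites + other_bites),
        "avg_full_moon": full_moon_bites / full_moon_days,
        "avg_other": other_bites / other_days,
    }

# Toy data (hypothetical): a ten-day window containing one full moon date.
start, end = date(2020, 1, 1), date(2020, 1, 10)
moons = [date(2020, 1, 10)]
bites = [date(2020, 1, 10), date(2020, 1, 10), date(2020, 1, 2),
         date(2020, 1, 3), date(2020, 1, 3), date(2020, 1, 5)]
result = summarize_bites(bites, moons, start, end)
print(result)
```

The sketch does not guard against an empty window or a window with no full moon dates; real data would need those checks.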

Results 

There is no statistically significant difference. The moon may be a harsh mistress, but its impact on us does not appear to vary with whether it is full or not.

Out of the total 8618 reported bites from 2010-01-01 to 2018-04-17:

  • 278 bites occurred on full moon days.
  • 8340 bites occurred on non-full moon days.

This means that approximately 3.2% of all bites occurred on full moon days.
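
The percentage follows directly from the two counts, which can be checked with a few lines of Python (the variable names are illustrative):

```python
full_moon_bites = 278
other_bites = 8340

total = full_moon_bites + other_bites
share = full_moon_bites / total

print(total)           # 8618
print(f"{share:.1%}")  # 3.2%
```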

The average number of bites per day during the entire period from 2010-01-01 to 2018-04-17 is approximately:

  • 2.73 on full moon days
  • 2.85 on non-full moon days

This suggests that there’s a slightly higher average number of bites on non-full moon days compared to full moon days, but this slight difference is not statistically significant.

Let’s visualize these averages for a clearer comparison.

[Figure: bar plot of the average number of animal bites per day on full moon days versus non-full moon days]

The bar plot visually represents the average number of animal bites per day on full moon days versus non-full moon days from January 1, 2010 to April 17, 2018. As we can see, the average number of bites is only slightly higher on non-full moon days compared to full moon days.

This suggests that there is no significant increase in the number of animal bites during full moons for the entire period studied. The reverse is also true: the tiny increase on non-full moon days is not statistically significant.

Discussion 

As a check of statistical significance, I asked ChatGPT to review the data using a two-sample t-test.

The null hypothesis for this test is that the average number of bites on full moon days is the same as the average number of bites on non-full moon days.

A t-test will provide a p-value, which is the probability of observing a difference as large as what was observed (or larger) if the null hypothesis is true. A small p-value (typically, less than 0.05) indicates strong evidence against the null hypothesis, so we would reject the null hypothesis and conclude there is a difference.

This test assumes the data is normally distributed and the variances are equal. These assumptions might not be completely valid for our data, but the t-test is fairly robust to violations of these assumptions when sample sizes are large, as they are in our case.

The t-test statistic is approximately -1.27 and the p-value is approximately 0.20. A p-value of 0.20 is larger than the commonly used significance level of 0.05, which means we do not have enough evidence to reject the null hypothesis. In other words, we cannot conclude that there is a statistically significant difference between the average number of bites on full moon days and non-full moon days.
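
For readers who want to see the mechanics, here is a minimal sketch of an equal-variance (pooled) two-sample t statistic using only the Python standard library. It is not the code from the session. The function names and the sample counts are hypothetical, and the p-value uses a normal approximation to the t distribution, which is an assumption that is reasonable only when the samples are large. In practice a library routine such as SciPy's `scipy.stats.ttest_ind` would typically be used instead.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

def pooled_t_statistic(a, b):
    """Two-sample t statistic assuming equal variances (pooled estimate)."""
    n_a, n_b = len(a), len(b)
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / (n_a + n_b - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))

def approx_two_sided_p(t):
    """Two-sided p-value from a normal approximation to the t distribution.

    Assumption: degrees of freedom are large enough for the approximation.
    """
    return 2 * (1 - NormalDist().cdf(abs(t)))

# Hypothetical daily bite counts, far too few for the approximation to be
# trustworthy; they only show how the functions are called.
full_moon_counts = [2, 3, 1, 4, 2]
other_counts = [3, 2, 4, 3, 5, 3]
t = pooled_t_statistic(full_moon_counts, other_counts)
print(round(t, 3))

# Sanity check against the reported statistic of about -1.27.
p_reported = approx_two_sided_p(-1.27)
print(round(p_reported, 2))
```

Under the normal approximation, a statistic of -1.27 corresponds to a two-sided p-value of roughly 0.20, which is consistent with the figure reported above.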

This statistical result supports our earlier observations and visualizations that suggested there is no significant correlation between full moons and an increased number of animal bites.

This is a very basic analysis, but it shows a few things that could help advance our scientific understanding of the world. As much as we love the moon, a full moon does not seem to change human or animal behavior in any statistically significant way. If experience leads to conclusions that human or animal behavior is different in some way, it could actually be other factors at play or perhaps just humans acting on myths. The difference in average daily bite incidents between full moon and non-full moon days was not statistically significant, as determined by a two-sample t-test (p-value = 0.20). This suggests that the belief in increased animal bites during a full moon is likely a myth, and any perceived increase might be attributed to other factors or merely coincidence.

Conclusion 

The analysis included 8618 reported bite incidents from January 1, 2010, to April 17, 2018. Of these, 278 bites (3.2% of total) occurred on full moon days, while the remaining bites occurred on non-full moon days. The average number of bites per day was 2.73 on full moon days and 2.85 on non-full moon days.

Thanks to the new Code Interpreter capability of OpenAI I was able to conduct this analysis in a matter of minutes. Then in just under an hour I was able to produce both this blog post and a format of results suitable for publishing in a scientific journal.

For the paper published for community feedback, see: Advancing Science With Artificial Intelligence; An Example Using A Quantitative Analysis of Full Moon Influence on Human and Animal Behavior

Recommendation

The age of AI is here. We would all be well served to continue to learn these new technologies and seek ways to apply them to our daily lives.

Bob Gourley

About the Author

Bob Gourley

Bob Gourley is an experienced Chief Technology Officer (CTO), Board Qualified Technical Executive (QTE), author and entrepreneur with extensive past performance in enterprise IT, corporate cybersecurity and data analytics. He is CTO of OODA LLC, a unique team of international experts that provides board advisory and cybersecurity consulting services. OODA publishes OODALoop.com. Bob has been an advisor to dozens of successful high tech startups and has conducted enterprise cybersecurity assessments for businesses in multiple sectors of the economy. He was a career Naval Intelligence Officer and is the former CTO of the Defense Intelligence Agency.