The American Flag // Lorenzo Yario's Website

Generative AI usage, when not immediately apparent, will be disclosed with a robot 🤖 emoji.

For a while, it felt like the United States flag was being flown at half-mast almost too often. Of course, flying it at half-mast is an important gesture: it honors the fallen, recognizes tragedies, and shows respect for others. But at some point the question becomes: when is it too much? When does it turn into a boy-who-cried-wolf situation? My first step was to determine when the flag had been flown at half-mast. My hypothesis was that the flag is flown this way more often now than it was 15 years ago. Where could I find the data? I decided to ask ChatGPT whether it knew of a website that might already have the dataset I was looking for. As expected, it did not give me anything helpful, only an explanation of why the flag is flown this way. A quick Google search did not turn up a data source either, only news stories about recent instances when the flag was flown at half-mast.

I then tried DuckDuckGo, which did not return good results either (its notice read: Not many results contain mast, usa. Search only for half "mast" data flag "usa"?). However, I did come across a website called flagwatch.net, a simple site built on the White House RSS feed. It takes a different approach to determining whether flags should be at half-mast. Its description states, “This site uses the White House RSS feed to find Presidential Orders ordering flags to half-staff. It then parses the order with an LLM to generate this webpage. You are welcome to scrape the site directly or use the JSON feed.” So, could ChatGPT or a similar tool determine whether a proclamation orders the flag to half-mast? Perhaps. That appeared to be my best bet. Finally, I found my smoking gun: https://www.federalregister.gov/presidential-documents/proclamations. This government website lists presidential proclamations going back to 1994, which gives 31 years of data for this experiment.

My plan: scrape all proclamations and create a MySQL database with these fields:

id (primary key)
proclamation_number
date
president
text

To accomplish this, I first asked AI, which generated absolutely terrible code that did not achieve my desired outcome and was extremely convoluted. Eventually, 🤖 generative AI did help me generate a CSV once I gave it very clear instructions.

That CSV was really only part one, as it only showed me which pages I'd need to scrape. A couple of hours later, I had a .sql file containing every proclamation since early in the Clinton administration.

Once I got the file imported into MySQL, I began querying.



mysql> SELECT * FROM proclamations_full LIMIT 1;

Now, much of this data will not be particularly relevant to us at the moment. So the next thing I did was to count the number of times either “half-staff” or “half-mast” appeared in this dataset.

mysql> SELECT COUNT(*) FROM proclamations_full WHERE `body_text` LIKE '%half-mast%' OR `body_text` LIKE '%half-staff%';
+----------+
| COUNT(*) |
+----------+
|      243 |
+----------+
1 row in set (0.45 sec)
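To experiment with this kind of count without a MySQL server, here is a minimal Python sketch that uses an in-memory SQLite database as a stand-in. The table and `body_text` column names come from the query above, the other columns follow the planned fields, and the three rows are invented purely for illustration. They are not real proclamations.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the MySQL server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE proclamations_full (
        id INTEGER PRIMARY KEY,
        proclamation_number TEXT,
        date TEXT,
        president TEXT,
        body_text TEXT
    )
    """
)

# Made-up rows for illustration only; these are not real proclamations.
rows = [
    ("0001", "2000-01-01", "Example President",
     "The flag shall be flown at half-staff until sunset."),
    ("0002", "2000-02-01", "Example President",
     "I proclaim this week National Example Week."),
    ("0003", "2000-03-01", "Example President",
     "The flag shall be flown at half-mast for 30 days."),
]
conn.executemany(
    "INSERT INTO proclamations_full "
    "(proclamation_number, date, president, body_text) VALUES (?, ?, ?, ?)",
    rows,
)

# Same filter as the MySQL query: match either spelling anywhere in the text.
(count,) = conn.execute(
    "SELECT COUNT(*) FROM proclamations_full "
    "WHERE body_text LIKE '%half-mast%' OR body_text LIKE '%half-staff%'"
).fetchone()
print(count)
```

Only the two rows that mention half-staff or half-mast are counted; the commemorative-week row is skipped.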

Now what? Well, all 243 are probably important. But how long were flags at half-mast each time?

Can we just take the signing date? Nope. Some proclamations order flags to be flown at half-mast for multiple days. So blindly pulling dates doesn’t work.

I decided on a hybrid approach. Here’s my thought process:

Use SQL + regex to pull any text like “half-mast for 30 days” or specific date ranges
For edge cases, manually review the proclamation
Use generative AI 🤖 to analyze and extract structured date data (start, end, exceptions)
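The regex step in the list above might look something like this in Python. This is only a sketch: the two patterns and the `extract_range` helper are hypothetical, real proclamations are worded in many more ways, and treating the signing date as day one of an "N days" period is an assumption.

```python
import re
from datetime import date, datetime, timedelta

# Hypothetical patterns; actual proclamation wording varies far more than this.
DURATION_RE = re.compile(r"for (?:a period of )?(\d+) days", re.IGNORECASE)
UNTIL_RE = re.compile(r"until sunset,?\s+(?:on\s+)?([A-Z][a-z]+ \d{1,2}, \d{4})")


def extract_range(text, signed):
    """Return (start, end) dates for a proclamation, or None if nothing matches.

    Assumes the half-mast period starts on the signing date, which will not
    hold for every proclamation.
    """
    match = DURATION_RE.search(text)
    if match:
        days = int(match.group(1))
        # Count the signing date itself as the first day.
        return signed, signed + timedelta(days=days - 1)

    match = UNTIL_RE.search(text)
    if match:
        end = datetime.strptime(match.group(1), "%B %d, %Y").date()
        return signed, end

    # No recognizable pattern: leave this one for manual or AI review.
    return None


example = "The flag shall be flown at half-staff until sunset, March 5, 2020."
print(extract_range(example, date(2020, 3, 2)))
```

Anything that returns None falls through to the manual review and AI steps.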

After trying that, it occurred to me that these proclamations are worded quite differently from one administration to the next, from Clinton's through Trump's second term. Maybe the flagwatch.net person was right? I decided to take that route and use AI to extract which days the flag was to be flown at half-mast. It ended up generating a relatively good CSV file for me to use, or rather, output that got appended to the existing CSV. GPT-5 mostly followed the format specified in the prompt I gave it:

“You are to determine what days from each proclamation the US flag is to be flown at half-mast. ONLY output the date in a JSON format in MM/DD/YYYY with start_date, end_date, and exceptions. Exceptions are for if there is a day within a timerange in which the flag would be flown at full-staff.”

The actual Python code (analyze_this.py) gave me a bunch of headaches because of how many conversions and other fixes it took just to get it running without errors.
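For illustration, replies in the format that prompt asks for could be turned into per-year day counts roughly like this. This is not the contents of analyze_this.py. The function names are hypothetical, the sketch assumes `exceptions` comes back as a list of MM/DD/YYYY strings, and it assumes a day covered by two overlapping proclamations should be counted once.

```python
import json
from collections import Counter
from datetime import datetime, timedelta


def parse_date(value):
    """Parse an MM/DD/YYYY string into a date."""
    return datetime.strptime(value, "%m/%d/%Y").date()


def half_staff_days(raw_json):
    """Expand one JSON reply into the set of days the flag was at half-mast."""
    record = json.loads(raw_json)
    start = parse_date(record["start_date"])
    end = parse_date(record["end_date"])
    # Assumption: exceptions is a list of date strings (possibly empty or null).
    skipped = {parse_date(d) for d in record.get("exceptions") or []}

    days = set()
    current = start
    while current <= end:
        if current not in skipped:
            days.add(current)
        current += timedelta(days=1)
    return days


def days_per_year(replies):
    """Count distinct half-mast days per calendar year across all replies."""
    all_days = set()
    for raw in replies:
        all_days |= half_staff_days(raw)  # a set, so overlaps count once
    return Counter(day.year for day in all_days)


replies = [
    '{"start_date": "12/30/2020", "end_date": "01/02/2021", "exceptions": ["01/01/2021"]}',
    '{"start_date": "01/02/2021", "end_date": "01/03/2021", "exceptions": []}',
]
counts = days_per_year(replies)
print(sorted(counts.items()))
```

Using a set of dates rather than adding up durations handles ranges that cross a year boundary and keeps overlapping orders from inflating the totals.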

Let's dive into the findings.

The findings

Bar graph of the number of days per year that the flag was at half-mast. It's not quite the same.

After finagling with the code, I was able to get this raw output to display. The number of days each year the flag was flown at half-mast:


1994    38
1995    16
1996    22
1997     5
1998    10
1999     8
2000     9
2001    23
2002     7
2003    13
2004    33
2005    45
2006    13
2007    36
2008     5
2009    14
2010    15
2011    11
2012    31
2013    22
2014     4
2015    25
2016    43
2017    15
2018    60
2019    24
2020    14
2021    63
2022    46
2023    31
2024     7
2025    40
Name: count, dtype: int64

Above you can see the same data in both bar graph and text form. It does appear that the flag is flown at half-mast more often than before, but a linear regression model would be a better way to check this.

The same graph from above, but with a linear regression model applied. As we can see, the flag is being flown at half-mast more than before.
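The trend line can be reproduced from the yearly totals listed earlier. The sketch below fits an ordinary least squares line using only the standard library. The `fit_line` helper is hypothetical and not necessarily how the graph was produced; `numpy.polyfit(years, days, 1)` should give the same line.

```python
# Days at half-mast per year, copied from the output above (1994 through 2025).
days = [38, 16, 22, 5, 10, 8, 9, 23, 7, 13, 33, 45, 13, 36, 5, 14,
        15, 11, 31, 22, 4, 25, 43, 15, 60, 24, 14, 63, 46, 31, 7, 40]
years = list(range(1994, 1994 + len(days)))


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the 1/n factors cancel out).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(years, days)
print(f"trend: {slope:+.2f} days per year")
```

On these numbers the slope comes out positive, which matches the upward trend line in the graph.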

I consider my hypothesis to be true: the flag is indeed flown at half-mast more than it used to be, and the linear regression model shows that well. Hopefully, it does not become a boy-who-cried-wolf scenario in the future.

My key takeaway from this project was that it gave me more practice with Python, NumPy, pandas, and related tools, which are not a strong spot for me compared to other analysis software. If I were to do this again, I would likely stay in MySQL as much as possible, since that is a strong suit for me; I know queries well.
Another takeaway is not to take for granted that the data you want is just sitting there available for you, as many times it is not.