Daylight Saving Time: On modelling and robustness

In a previous post I looked at the daily rhythm of r/counting, at what time of day the subreddit is most active, and how that has changed throughout the years. In this post I will try and answer the question that’s on nobody’s lips: Is it possible to see the effect of daylight saving time on the r/counting schedule?

Reddit stores the UTC timestamp for every count, so it’s possible to compare the counting activity just before the start of DST with the counting activity just afterwards, and see whether there’s a difference. If counts always follow the same pattern in local time, and all counters observe DST at the same time, then that should show up as a rigid shift in the data.
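
To make the expected effect concrete, here’s a small illustration of my own (not from the original analysis, and using an arbitrary America/New_York example with Python’s zoneinfo): a count made at the same local wall-clock time lands one hour earlier in UTC once DST kicks in.

A toy illustration of the expected shift
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

eastern = ZoneInfo("America/New_York")
# The same 9 p.m. local count, one week apart, straddling the 2023 spring transition
before = datetime(2023, 3, 10, 21, 0, tzinfo=eastern)  # EST, UTC-5
after = datetime(2023, 3, 17, 21, 0, tzinfo=eastern)   # EDT, UTC-4
print(before.astimezone(timezone.utc).time())  # 02:00:00
print(after.astimezone(timezone.utc).time())   # 01:00:00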

That’s the theory at least. As a spoiler, I’m doing something a bit different in this post, so it’s going to seem a lot more open-ended and exploratory than some of the other ones. When I started coding and writing I didn’t entirely know where I was going to end up, so you can think of this as going on a journey with me. Apart from answering the question, my goal with this post is to show you my general approach for dealing with these kinds of questions, and to highlight some of my thoughts along the way.

Before I can get started, there’s one more thing to get out of the way, namely the fact that DST occurs at different times (if at all) throughout the world. In this post, I’ve focussed exclusively on DST in the US & Canada[1], where it starts on the second Sunday in March and ends on the first Sunday of November each year[2].

  • [1] Apart from Hawaii and Arizona, which are weird

  • [2] That hasn’t always been the DST rule, but it’s been the case for as long as r/c has existed

    I’ll start off with some code to import the relevant packages and load the data.

    Code for importing packages and loading data
    import os
    import sqlite3
    from calendar import timegm
    from datetime import datetime, timedelta
    from pathlib import Path
    
    import matplotlib.pyplot as plt
    import numpy as np
    import pandas as pd
    import plotly.express as px
    import plotly.graph_objects as go
    import plotly.io as pio
    from IPython.display import Markdown
    from rcounting import analysis, plots, units
    from rcounting.counters import apply_alias
    
    pio.templates.default = "seaborn"
    
    data_directory = Path(os.getenv("COUNTING_DATA"))
    db = sqlite3.connect(data_directory / "counting.sqlite")

    A first model

    To see the effect of DST, I can compare the time of day plots for the weeks just before and just after DST is introduced, and see if there are any obvious differences. To maximise the effect, I’ll focus only on counts that occurred Monday to Friday, since people’s weekday schedules should generally be more regular than their weekend ones.

    To start with, I’m going to need some code to find when DST starts and ends in any given year.

    Finding the start of DST for every year
    days = ["monday", "tuesday", "wednesday", "thursday", "friday", "saturday", "sunday"]


    def find_nth_weekday(year, month, weekday, n):
        """Find the nth occurrence of a weekday in a month, at 2 a.m. (when the clocks change)."""
        d = datetime(year, month, 1 + 7 * (n - 1))
        offset = (days.index(weekday.lower()) - d.weekday()) % 7
        return d + timedelta(offset) + timedelta(hours=2)
    
    years = range(2012, 2024)
    dst_start = {
        year: timegm(find_nth_weekday(year, 3, "sunday", 2).timetuple()) for year in years
    }
    
    dst_end = {
        year: timegm(find_nth_weekday(year, 11, "sunday", 1).timetuple()) for year in years
    }
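
    As a quick sanity check of my own (not part of the original analysis), the 2023 entry should decode to 2 a.m. on 12 March 2023, the second Sunday of that March (note that timegm treats the naive 2 a.m. wall-clock time as if it were UTC).

    Checking the DST start times
    print(datetime.utcfromtimestamp(dst_start[2023]))  # 2023-03-12 02:00:00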

    I’ll also define a bunch of constants that I’m going to need later.

    Defining constants for later use
    WEEK = 7 * 24 * units.HOUR          # one week, in seconds
    OFFSET = 5 * units.HOUR             # shift from UTC to UTC-5
    BIN_WIDTH = 30                      # width of each time-of-day bin, in seconds
    nbins = int(units.DAY / BIN_WIDTH)  # number of bins in a day
    minval, maxval = -3, 3              # window of weeks around the transition to load
    BIN_TO_MINUTE = BIN_WIDTH / 60      # converts a shift measured in bins to minutes
    time_axis = np.linspace(0, units.DAY, nbins, endpoint=False) + BIN_WIDTH / 2  # bin centres
    dy = 0.07

    And finally I’ll define some code for getting the raw data into slightly more manageable shape, and to calculate a counting rate over time from the timestamps of each count.

    Code for getting the raw data into slightly more manageable shape.
    def wrangle(data, week_map=dst_start):
        """Tidy the raw comments and label each one with a time of day and a week
        relative to that year's DST transition (dst_start by default)."""
        data = data.reset_index(drop=True)
        data["username"] = data["username"].apply(apply_alias)
        data["timestamp"] = data["timestamp"] - OFFSET  # move from UTC to UTC-5
        data["date"] = pd.to_datetime(data["timestamp"], unit="s")
        data["time"] = data["timestamp"] % units.DAY  # time of day, in seconds
        # Week 0 starts at the DST transition; negative weeks come before it
        data["week"] = (np.floor((data["timestamp"]
                                  - pd.to_datetime(data["timestamp"], unit="s").dt.year.map(week_map))
                                 / WEEK)
                        .astype(int))
        data["year"] = data["date"].dt.year
        return data
    
    def generate_kdes(df):
        """Estimate a smooth counting rate over the day for each week, using analysis.fft_kde."""
        n_weeks = len(df["week"].unique())
        kdes = (df
                .groupby("week")["time"]
                .apply(lambda x: pd.Series(analysis.fft_kde(
                    x,
                    nbins,
                    kernel="normal_distribution",
                    sigma=0.02)[1] * nbins))
                .reset_index()[["week", "time"]])
        kdes["time_axis"] = np.hstack(n_weeks * [time_axis])
        kdes.columns = ["week", "rate", "time"]
        kdes["time_date"] = pd.to_datetime(kdes["time"], unit="s")
        return kdes

    With that out of the way, I’ll select the weeks just before the start of DST and the weeks just after the end of DST for every year in r/counting’s history. Then, I can plot the distribution of counts throughout the day for the week before and the week after the introduction of DST, and see how they differ:

    Code to plot the DST and DST - 1 distributions
    opacity = 0.8
    fillcolors = [
        "rgba" + x[3:-1] + f", {opacity})"
        for x in pio.templates[pio.templates.default].layout.colorway
    ]
    
    
    def query(x):
        return (
            f"select timestamp, username from comments where timestamp between "
            f"{x + OFFSET + minval * WEEK} and {x + OFFSET + maxval * WEEK} "
            f"order by timestamp"
        )
    
    
    spring_all = wrangle(pd.concat([pd.read_sql(query(x), db) for x in dst_start.values()]))
    
    
    def mask(df):
        return (df["date"].dt.weekday < 5) & (-2 <= df["week"]) & (df["week"] < 2)
    
    
    spring = spring_all[mask(spring_all)].copy()
    spring_kdes = generate_kdes(spring)
    week_map = {
        -2: "Control without DST",
        -1: "Without DST",
        0: "With DST",
        1: "Control with DST",
    }
    spring_kdes["week_name"] = spring_kdes["week"].map(week_map)
    
    
    def kde_plot(df, weeks):
        fig = go.Figure()
        for week, fillcolor in zip(weeks, fillcolors):
            frame = df.query("week_name == @week")
            fig.add_trace(
                go.Scatter(
                    x=frame["time_date"],
                    y=frame["rate"],
                    fill="tozeroy",
                    name=week,
                    mode="lines",
                    fillcolor=fillcolor,
                )
            )
        fig.update_xaxes(tickformat="%H:%M", title="Time of Day (UTC - 5)")
        fig.update_yaxes(title="Counting rate (arbitrary units)")
        fig.update_layout(legend=dict(yanchor="top", y=0.99, xanchor="left", x=0.01))
        return fig
    
    fig = kde_plot(spring_kdes, ["Without DST", "With DST"])
    fig.show()

    The shape of the two plots is similar, and it looks like the plot with DST is generally leading the one without, as would be expected if one was just a rigid shift of the other. But it’s certainly not a perfect match, and it’s hard to see from the curves just how much the DST curve is leading.

    I can try and see what the optimal shift of the DST curve would be to get it to match the curve without DST.
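
    Before doing that with the real curves, here’s a quick self-contained check (my addition) that rolling one curve past the other and taking the argmax of the overlap really does recover a known shift:

    A toy check of the rolled-overlap idea
    t = np.linspace(0, 2 * np.pi, 100, endpoint=False)
    base = np.exp(np.cos(t))                  # a smooth, periodic "daily activity" curve
    shifted_curve = np.roll(base, 10)         # the same curve, rotated forward by 10 bins
    overlaps = [np.sum(base * np.roll(shifted_curve, -i)) for i in range(len(t))]
    print(int(np.argmax(overlaps)))           # 10 -- the shift we put in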

    Code to calculate and plot DST and DST -1 overlaps
    def calculate_shifted_overlap(df, week1, week2):
        """Overlap between the week1 curve, rotated by every possible shift, and the week2 curve."""
        fixed = df.loc[df["week_name"] == week2, "rate"].to_numpy()
        rotating = df.loc[df["week_name"] == week1, "rate"].to_numpy()
        norm = np.trapz(fixed * rotating, x=time_axis)
        shifts = [
            np.trapz(fixed * np.roll(rotating, i), x=time_axis) / norm
            for i in range(len(fixed))
        ]
        # Map the argmax into the range [-nbins/2, nbins/2) so that shifts can be negative
        optimal_shift = (np.argmax(shifts) + nbins / 2) % nbins - nbins / 2
        return shifts, optimal_shift
    
    
    shifts, optimal_shift = calculate_shifted_overlap(
        spring_kdes, "With DST", "Without DST"
    )
    
    print(f"The optimal shift is {int(optimal_shift * BIN_TO_MINUTE)} minutes.")
    
    fig = px.line(x=pd.to_datetime(time_axis, unit="s"), y=shifts)
    fig.update_xaxes(tickformat="%H:%M", title="Shift (hours)")
    fig.update_yaxes(title="Similarity score")
    fig.show()
    The optimal shift is 45 minutes.

    That’s a bit less than one hour, but it’s still suggestive. Apparently it is possible to use the counting data to determine whether or not DST is currently active.

    So, case closed, right?

    Validating the model

    Not so fast.

    It could be that there’s a shift of one hour every week and DST has nothing to do with it! More seriously, there are other changes happening throughout the time period apart from DST; in the spring the days are getting longer, particularly the evenings, and maybe that’s what’s driving the change. And I haven’t at all looked at what happens when the clocks go back.

    Adding more weeks

    Let’s start by looking at what happens before DST is active. For the preceding analysis to be valid, the distribution of counts throughout the day would need to be basically the same in the two weeks before the start of DST.

    Code to plot the DST - 1 and DST - 2 distributions
    fig = kde_plot(spring_kdes, ["Without DST", "Control without DST"])
    fig.show()

    Hm. Those two curves might be slightly more aligned than the two with and without DST, but it’s not super clear. I can check the optimal shift:

    Code
    _, optimal_shift = calculate_shifted_overlap(spring_kdes, "Without DST", "Control without DST")
    print(f"The optimal shift is {int(optimal_shift * BIN_TO_MINUTE)} minutes.")
    The optimal shift is 98 minutes.

    That’s an even bigger shift than the one that happened when DST was introduced! I can plot the four curves for the two weeks before and after DST together and see if there’s any obvious pattern.

    Code
    def plot(df):
        fig = px.line(
            df,
            x="time_date",
            y="rate",
            color="week_name",
            labels={"week_name": "Week", "time_date": "Time"},
        )
        fig.update_yaxes(title="Counting rate (arbitrary units)")
        fig.update_xaxes(tickformat="%H:%M", title="Time of Day (UTC - 5)")
        return fig
    fig = plot(spring_kdes)
    fig.show()

    The plot is very busy but you can show or hide individual traces by clicking on the legend. If you didn’t have the legend, would you be able to tell which two of these curves were with DST and which were without? It seems that the variation from week to week is so big that any DST signal that might be present in the data is just swamped by all the noise.

    Including the end of DST

    I can try and see if including the data for the end of DST makes any difference:

    Code
    autumn_all = wrangle(pd.concat([pd.read_sql(query(x), db) for x in dst_end.values()]), dst_end)
    autumn = autumn_all[mask(autumn_all)].copy()
    # Flip the week index so that "with DST" and "without DST" line up with the spring labels
    autumn["week"] = -1 - autumn_all["week"]
    kdes = generate_kdes(pd.concat([spring, autumn]))
    kdes["week_name"] = kdes["week"].map(week_map)
    fig = plot(kdes)
    fig.show()
    _, optimal_shift = calculate_shifted_overlap(kdes, "With DST", "Without DST")
    print(f"The optimal shift is {int(optimal_shift * BIN_TO_MINUTE)} minutes.")

    Figure 1: The aggregated activity on r/counting in the two weeks on either side of the start/end of DST.

    The optimal shift is 17 minutes.

    As before – would you be able to tell which of these graphs were with DST and which were without if you didn’t have the legend?

    Summing up

    The validation of the model has revealed that the activity on r/counting varies enough on a week to week basis that my initial assumptions are incorrect, and I can’t just treat the activity as a constant background with a DST signal on top. If I want to see the effect of DST, I’m going to have to come up with something more clever.

    More Advanced Models

    Disaggregating the years

    What I did in the previous section was to aggregate the activity on r/counting across all the years it’s been active. After that, I homed in on specific weeks near the time of year when the clocks change, and asked if there was a rigid shift in the data.

    This analysis revealed that the activity on r/counting isn’t stable over time. Maybe I’m losing information by aggregating all the years, and the signal would be clearer if I looked at each year separately.

    Before I can make these comparisons I’m going to need a way of boiling down the information. Figure 1 and friends in the previous section showed that spotting the shift by eye is very difficult, and if the plot is further split into a new line for each year, it’s going to become completely unreadable.

    What I need is a way of compressing each (week, year) pair to a single point, so that the plots are still legible even after disaggregating the years.

    I can use the fact that the DST offset is exactly one hour to accomplish just this: for each week, I can calculate how much the distribution resembles that of the week before, and I can also calculate how much it resembles the one-hour-shifted distribution from the week before.

    For most of the year, it should be the case that the unshifted distribution is more similar than the shifted distribution. But for the week where the clocks change, the shifted distribution should be more similar. So, I can calculate the similarity of the lagged and shifted distribution, and subtract the similarity of just the lagged distribution, and I have a DST fingerprint. For most weeks, it should give a negative value, but for the week where the clocks change it should give a positive value.
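
    To make that sign convention concrete, here’s a toy version of the idea (my addition, using a plain inner product as the similarity measure on an invented daily curve, purely for illustration):

    A toy version of the DST fingerprint
    rng = np.random.default_rng(0)
    last_week = np.exp(np.cos(np.linspace(0, 2 * np.pi, 288, endpoint=False)))  # 288 five-minute bins
    one_hour = 12  # one hour is twelve five-minute bins


    def toy_fingerprint(this_week, last_week):
        lagged = np.dot(this_week, last_week)
        lagged_and_shifted = np.dot(this_week, np.roll(last_week, -one_hour))
        return lagged_and_shifted - lagged


    # An ordinary week: this week looks like last week (plus a little noise) -> negative
    print(toy_fingerprint(last_week + 0.01 * rng.normal(size=288), last_week) < 0)  # True
    # The DST week: the whole pattern moves an hour earlier -> positive
    print(toy_fingerprint(np.roll(last_week, -one_hour), last_week) > 0)            # True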

    Let’s see how it goes!

    Code
    def dst_fingerprint(df, period="spring"):
        """Calculate the DST fingerprint for a single year."""
        transitions = dst_start if period == "spring" else dst_end
        # Bin the counts into five-minute intervals and normalise each day to sum to one
        x = df.resample("300s", on="date").size()
        rates = x.div(x.groupby(pd.Grouper(freq="1d")).transform("sum")).to_frame(
            name="rate"
        )
        # Re-index so that time is measured relative to that year's DST transition
        rates.index = rates.index - pd.to_datetime(
            rates.index.year.map(transitions), unit="s"
        )
        # Last week's distribution, and last week's distribution shifted by the DST offset
        shifted = rates.shift(freq="7d")
        shift = "-1h" if period == "spring" else "1h"
        dst_shifted = shifted.shift(freq=shift)

        deltas = []

        for comparison in [shifted, dst_shifted]:
            background = pd.Series(
                (maxval - minval) * [np.nan], range(minval, maxval), name="delta"
            )
            background.index.name = "date"
            f1 = pd.merge(rates, comparison, left_index=True, right_index=True)
            if len(f1) != 0:
                # Sum of squared differences between this week and the comparison, one value per week
                f1["delta"] = (f1["rate_x"] - f1["rate_y"]) ** 2
                series = f1.groupby(f1.index.days // 7)["delta"].sum()
                background.loc[series.index] = series
            deltas.append(background)

        return deltas[1] - deltas[0]
    
    
    def multiple_dst_fingerprints(df, period="spring"):
        groups = df.groupby("year").apply(dst_fingerprint, period=period)
        return groups.reset_index().melt(id_vars="year")
    
    
    def scatterplot(df):
        # The dark line is the year-weighted combination of the per-year values
        total = (
            df.set_index(["year", "date"])["value"].mul(year_norm).groupby(level=1).sum()
        )
        fig = px.scatter(
            df,
            x="date",
            y="value",
            color="year",
            color_continuous_scale="plasma",
            labels={"value": "DST fingerprint", "year": "Year", "date": "Week"},
        )
        line = go.Scatter(
            x=total.index, y=total, line=dict(color="hsl(0, 0%, 40%)"), mode="lines"
        )
        fig.add_trace(line)
        fig.add_hline(y=0, line_width=2, line_dash="dash", line_color="hsl(0, 0%, 20%)")
    
        fig.update_xaxes(title="Weeks after start of DST")
        fig.update_yaxes(title="DST fingerprint")
        fig.update_coloraxes(showscale=False)
        fig.update(layout_showlegend=False)
        return fig
    
    
    # Weights: each counter's share of their year's counts, and each year's share of all counts
    week_norm = (
        spring_all.groupby(["year", "username"]).size() / spring_all.groupby("year").size()
    )
    year_norm = spring_all.groupby("year").size() / len(spring_all)
    
    df = multiple_dst_fingerprints(spring_all).dropna()
    
    fig = scatterplot(df)
    fig.show()

    Hm. This isn’t very promising. The DST signal should show up in this plot in the fact that the points at 0 should lie significantly higher than all the others. That’s not the case at all.

    I can do the same thing for when DST ends, just for good measure, to see if the signal shows up there:

    Code
    df = multiple_dst_fingerprints(autumn_all, "autumn").dropna()
    fig = scatterplot(df)
    fig.update_xaxes(title="Weeks after end of DST")
    fig.show()

    Unfortunately, that didn’t seem to show the signal either. Before giving up completely and abandoning this as a fool’s errand, there’s one or two more things I can try.

    Disaggregating the different counters

    Regulars of r/counting will know that it’s not the same people who count every week. If proof of this is needed, you can take a look at the top weekly counters list and see that it really isn’t just a repeat from week to week. Perhaps this is one cause of the lack of pattern in the counting times. It’s certainly possible to imagine a world where counters are perfectly regular, but the different schedules of different counters coupled with their different activity from week to week adds up to a huge mess.

    So I can keep going with the disaggregation, and see if I get a clearer signal when I compare the activity of individual counters from week to week.

    Code
    def username_fingerprint(df, period="spring"):
        fingerprint = (df.groupby(["year", "username"])
                       .apply(dst_fingerprint, period=period)
                       .reset_index()
                       .melt(id_vars=["year", "username"], var_name="date")
                       .set_index(["year", "username", "date"])["value"])
        return fingerprint
    
    fingerprints = username_fingerprint(spring_all)
    df = (
        fingerprints.dropna()
        .mul(week_norm)
        .groupby(level=[0, 2])
        .sum()
        .reset_index(name="value")
    )
    fig = scatterplot(df)
    fig.show()

    Looking at this graph almost makes me think I have a sign error in the way I’ve defined the DST fingerprint. I’ve double checked, and I don’t think it’s the case, but this certainly isn’t the peak at 0 I was hoping to see.

    Looking only at the most regular counters

    None of what I’ve tried so far has seemed to work. There’s one last thing I can try: I can find out which counters were most regular in the period leading up to the start of DST each year, and only include them in the calculations.

    To do that I’ll need to slightly rework the code from Section 1 for calculating the overlap between two different counting distributions. This will let me calculate the overlap for every counter for every year in the weeks around the onset of DST.

    Then it’s just a bit of fiddling with indices to find the 5 most regular counters two weeks before the start of DST for every year.

    Code
    def similarity_score(df):
        """Similarity between each week's counting distribution and the previous week's."""
        kdes = generate_kdes(df)
        groups = kdes.groupby("week")["rate"]
        norm = groups.transform(np.linalg.norm)
        kdes["rate"] /= norm  # normalise each week's curve to unit length
        overlaps = (
            (kdes.set_index(["week", "time"]).groupby("time")["rate"].diff() ** 2)
            .groupby(level=0)
            .sum()
        )
        return 1 - overlaps / 2

    scores = spring_all.groupby(["year", "username"]).apply(similarity_score)
    # Scores of exactly 1 come from the first week of each group, where there's no previous week to compare against
    scores = scores[scores != 1]
    counters = (
        scores
        .reset_index()
        .query("week== -2")
        .sort_values(["year", "rate"], ascending=False)
        .groupby("year")
        .head(5)
        .set_index(["year", "username"])
        .index
    )
    subset = spring_all.set_index(["year", "username"]).loc[counters]
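
    The last line of similarity_score turns a sum of squared differences into a similarity: since each week’s curve is normalised to unit length, 1 minus half the squared distance between two curves is just their dot product. A quick numerical check of that identity (my addition):

    Checking the similarity identity
    a = np.array([0.6, 0.8])  # two arbitrary unit-norm vectors
    b = np.array([0.8, 0.6])
    print(np.isclose(1 - np.sum((a - b) ** 2) / 2, a @ b))  # True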

    With that out of the way, I can plot the average similarity score for each year and week

    Code
    week_norm = subset.groupby(["year", "username"]).size() / subset.groupby("year").size()
    year_norm = subset.groupby("year").size() / len(subset)
    
    similarity = (
        scores.reset_index(level=2)
        .loc[counters]
        .set_index("week", append=True)["rate"]
        .mul(week_norm)
        .groupby(level=[0, 2])
        .sum()
        .rename_axis(["year", "date"])  # match the column names that scatterplot expects
        .reset_index(name="value")
    )

    fig = scatterplot(similarity)
    fig.update_yaxes(title="Week consistency score")
    fig.show()

    I’ve decided to work with the consistency score rather than the DST fingerprint[3], because I wanted to highlight the effect of the selection I’ve made. You can see on the graph that the score two weeks before the start of DST is significantly higher than in all the other weeks, and in particular the least consistent year for week -2 is much more consistent than the least consistent year for all the other weeks.

  • [3] The two scores behave similarly, with the exception that for the consistency score, we’d expect DST to show up as a dip at zero, rather than a peak.

    This apparent result is just an artefact of the way I’ve selected the data: I’ve limited my search to people who were very consistent two weeks before the onset of DST, so it’s no surprise that the consistency is high here. The fact that the consistency is lower in the following weeks is due to a regression towards the mean.
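
    To see how strong this selection effect can be, here’s a quick simulation with nothing but noise (my addition, using made-up “consistency scores” rather than r/counting data): picking the top five counters by their score in one particular week makes that week look special, even though every week is drawn from exactly the same distribution.

    A regression-to-the-mean toy simulation
    rng = np.random.default_rng(0)
    noise_scores = rng.normal(size=(50, 6))          # 50 fake counters, 6 weeks of pure noise
    top5 = np.argsort(noise_scores[:, 1])[-5:]       # select the top 5 in "week -2" (column 1)
    print(noise_scores[top5, 1].mean().round(2))     # high, purely because of the selection
    print(noise_scores[top5].mean(axis=0).round(2))  # the other weeks hover around the overall mean of 0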

    That statistical artefact aside, it doesn’t seem that this analysis has brought me any closer to finding a clear sign of DST in the counting data. With the consistency score, the signature of DST would be a dip at 0, so the fact that the least consistent week is the week DST starts is suggestive. But it’s not what I’d call proof.

    Conclusion

    If you want to find out whether or not the US currently has DST, then looking at the comments on r/counting is not a viable method for doing so. I would suggest just googling it instead.

    This post ended up being much longer than expected (and a fair bit longer than the reddit comment that it’s based on), mainly because I’ve had to change the conclusion along the way.

    In the original, and in my first draft, I wasn’t as thorough with my robustness analysis as I’ve been here. That meant that I was more convinced by the hints of a DST signal in the data, and the conclusion reflected that. Unfortunately, this post has demonstrated that it just isn’t there. On the positive side, the post has also demonstrated the value of checking assumptions, validating any model that you might come up with, and generally having a healthy dose of skepticism towards any new discoveries – especially your own.

    And that’s perhaps as good a place as any to end.

    Until next time!