Skip to content

Wait longer for wikitext history for dumps backfill.

Xcollazo requested to merge fix-sensor-timeout-mediawiki-dumps-backfill into main

Currently, when the set of DAGs for Dumps 2.0 fail, they do not send a message to data-engineering-alerts, since this code is still in progress. Instead, they send an email directly to @xcollazo (example).

I received the following messsage:

Airflow alert: <TaskInstance: dumps_merge_backfill_to_wikitext_raw.wait_for_data_in_mw_wikitext_history scheduled__2024-03-01T00:00:00+00:00 [failed]>

Try 1 out of 6 Exception: Sensor has timed out; run duration of 606100.482329 seconds exceeds the specified timeout of 604800.0.

This timed out because mediawiki_wikitext_history, which we use for backfilling, takes a long while to be populated for the month.

In this MR, we fix the sensor to wait up to 24 days, just like in the DAG that generates mediawiki_wikitext_history.

Edited by Xcollazo

Merge request reports