
6 posts tagged with "killed-in-gaza"


· 3 min read

Our work over the past year started with aggregating the statistical reports coming out of Gaza's Ministry of Health & Government Media Office to help make their work more accessible.

It's becoming clear that the numbers we've continued to report verbatim from official sources are a small slice of the overall picture of human suffering over the past year. It's important to highlight that the official numbers we publish here, and that nearly every media outlet and government have trusted, are restricted to:

  • only those deaths that can be identified, and
  • only those deaths that can be linked back to an act of Israeli aggression

Any Gazans known to have been killed and who didn't make it to health authorities, or who are long missing and presumed dead, are generally not included in the ministry reporting. Even those names submitted by the public must follow a rigorous confirmation process through a committee to confirm the circumstances of their death (we expand on this in our dataset source documentation).

The tireless work of Gaza's authorities is admirable and extremely important, but it's a very focused view of the impacts of Israel's actions. For a more realistic picture one must use these official numbers as one of many inputs alongside other important factors like:

  • malnutrition
  • infectious diseases
  • maternal & neonatal health problems
  • untreated or unsupported pre-existing conditions
  • unrecovered or buried bodies (more than 60% of Gaza's buildings are damaged or destroyed)

The toll of these combined factors is believed to far exceed the official numbers & names we've been reporting. Consider the following estimates provided by researchers:

The following was mentioned in the appendix to an October 2, 2024 letter to the US administration from medical professionals citing BMC Medicine research:

Deaths from military violence are usually the smaller share, and indeed civilian excess mortality in wars can be 25 times higher than deaths from violence.

By this measure, if you take Israel's own estimate that "at least 17,000" of the reported deaths were Hamas fighters, that implies almost half a million in excess civilian mortality.

From The Lancet, a medical and public health journal:

“it is not implausible to estimate that up to 186,000 or even more deaths could be attributable to the current conflict in Gaza.”

From the above-noted appendix (page 6) based on IPC reports of severe food insecurity:

In total it is likely that 62,413 people have died of starvation and its complications in Gaza from October 7, 2023 to September 30, 2024. Most of these will have been young children.


It's important to note that the vast majority of these estimated deaths would not be included in the daily casualties reports or periodical names list we've published here.

We will continue to aggregate and publish the reports from authorities in Gaza as we have, though we will be making more of an effort to put those numbers in context and with clearer caveats. We encourage all of those who rely on what we publish to take the same level of care.

· 2 min read

Around September 15th, Gaza officials released a 649-page PDF on their WhatsApp channel containing the list of those killed in Gaza up to August 31st.

The list format was very similar to recent updates we received on the Ministry of Health's Telegram channel. You can download the PDF that was used as the source for this update here. You can view the updated dataset on our Killed in Gaza page.

We noted the following changes to their reporting format from the last update:

  • they've added a new source value to each record, labeled either "judicial committee" or "house committee"; these values account for less than 5% of the records, and the vast majority are still attributed to the Ministry of Health

Methodology

We maintained the update method we last reported and continue to perform the following adjustments to the source report:

  • normalizing the date of birth format to YYYY-MM-DD
  • converting the source field from Arabic to an English abbreviation
  • converting the sex/gender field from Arabic to an English abbreviation
  • adding the English name translation using our existing lookup tables
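A minimal sketch of the date and sex/gender adjustments listed above. The input date layouts, the Arabic labels, and the `m`/`f` abbreviations are assumptions made for illustration; the source report's actual values may differ.

```python
from datetime import datetime

# Assumed input layouts; the formats actually found in the PDF may differ.
CANDIDATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d-%m-%Y")

# Assumed labels and abbreviations (hypothetical, for illustration only).
SEX_ABBREVIATIONS = {"ذكر": "m", "أنثى": "f"}


def normalize_dob(raw: str) -> str:
    """Return the date as YYYY-MM-DD, or "" if no candidate format matches."""
    text = raw.strip()
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return ""


def abbreviate_sex(raw: str) -> str:
    """Map an Arabic sex label to its English abbreviation, or "" if unknown."""
    return SEX_ABBREVIATIONS.get(raw.strip(), "")
```

Returning an empty string for unparseable values keeps the record in the list while making the gap visible, rather than guessing at a date.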

Change Summary

The following tables summarize the updated demographics of the Killed in Gaza list following this update:

Demographic | Number | %
Men | 14,347 | 41.8%
Women | 6,643 | 19.3%
Boys | 6,419 | 18.7%
Girls | 4,936 | 14.4%
Senior Men | 1,208 | 3.5%
Senior Women | 791 | 2.3%
Total Persons | 34,344

Of the children in this list (33% of the total), the following is the breakdown by age group:

Child Age Group | Number | %
Teen Boy (Under 18) | 2,579 | 22.7%
Teen Girl (Under 18) | 1,521 | 13.4%
Pre-teen Boy (Under 12) | 2,851 | 25.1%
Pre-teen Girl (Under 12) | 2,480 | 21.8%
Toddler Boy (Under 3) | 630 | 5.6%
Toddler Girl (Under 3) | 584 | 5.1%
Baby Boy (Under 1) | 359 | 3.2%
Baby Girl (Under 1) | 351 | 3.1%
Total Children | 11,355
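The age groups in the table above can be read as non-overlapping buckets, which is consistent with the percentages summing to 100%. The sketch below shows one way such a breakdown could be computed; the `m`/`f` encoding and the function names are hypothetical, not taken from the project's code.

```python
from collections import Counter


def child_age_group(age: int, sex: str) -> str:
    """Label a child (age under 18) using the table's group names.

    `sex` is assumed to be "m" or "f"; that encoding is hypothetical.
    """
    if not 0 <= age < 18:
        raise ValueError("not a child under this table's definition")
    noun = "Boy" if sex == "m" else "Girl"
    if age < 1:
        return f"Baby {noun} (Under 1)"
    if age < 3:
        return f"Toddler {noun} (Under 3)"
    if age < 12:
        return f"Pre-teen {noun} (Under 12)"
    return f"Teen {noun} (Under 18)"


def summarize(children: list[tuple[int, str]]) -> dict[str, tuple[int, float]]:
    """Return {group: (count, percent of all children)}."""
    counts = Counter(child_age_group(age, sex) for age, sex in children)
    total = sum(counts.values())
    return {g: (n, round(100 * n / total, 1)) for g, n in counts.items()}
```

As a check on the percentage column, 2,579 of 11,355 children rounds to 22.7%, which matches the first row.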

· 3 min read

On July 24, we received an updated list of names of those killed in Gaza up to June 30th. We've incorporated this list in its entirety, replacing our prior list.

The new list was in a PDF format like the recent updates. You can download the Ministry of Health PDF that was used as the source for this update here.

We noted the following changes to their reporting format from the last update:

  • they added back the "source" column we previously used to track public submissions vs. official sourcing, so we've accepted these values which may overwrite values previously reported as "unknown" for this column in our prior list
  • they added back the date of birth field and kept the age

Changes to Update Methodology

In the past we accepted only a subset of records so that we would not have to verify each update: we rejected record changes that differed drastically from the existing values (by some arbitrary change threshold, for the Arabic name for example). This kind of reconciliation has diminishing benefits now that the ministry's reporting format has mostly standardized. We also want to defer to the official source of truth rather than introduce our own decisions into their processes. For that reason you may notice the following main differences in output:

  • where we generated IDs for records where that field was missing, the field will now be empty - we did not consider this a breaking change given the vast majority of records have IDs
  • most records saw small age changes reflecting our wholesale acceptance of their record values - in the past we adjusted these based on the availability of data in prior reports and on validation issues we saw when comparing dates of birth, but the reference date we used may have accounted for a discrepancy as well
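To illustrate how a reference date alone can shift a reported age by one year, here is a small sketch. The date of birth is made up, and the two reference dates are chosen only as examples (January 5, 2024 is the reference date of our earlier list, June 30 is this list's cut-off).

```python
from datetime import date


def age_on(dob: date, reference: date) -> int:
    """Whole years completed between `dob` and `reference`."""
    had_birthday = (reference.month, reference.day) >= (dob.month, dob.day)
    return reference.year - dob.year - (0 if had_birthday else 1)


dob = date(2010, 3, 15)                      # hypothetical date of birth
age_in_january = age_on(dob, date(2024, 1, 5))
age_in_june = age_on(dob, date(2024, 6, 30))
```

Here the same person is 13 on the earlier reference date and 14 on the later one, so a one-year difference between two lists does not necessarily indicate a data error.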

Otherwise the main changes we make to their reporting include:

  • normalizing the date of birth format to YYYY-MM-DD
  • converting the source field from Arabic to an English abbreviation
  • adding the English name translation using our existing lookup tables

Change Summary

We issued this update in two parts: a partial list update in August and another in early September.

The following tables summarize the updated demographics of the Killed in Gaza list following the latest update on September 7th, 2024:

Demographic | Number | %
Senior Men | 982 | 3.5%
Senior Women | 655 | 2.3%
Men | 11,583 | 41.1%
Women | 5,614 | 19.9%
Boys | 5,202 | 18.5%
Girls | 4,149 | 14.7%
Total Persons | 28,185

· 4 min read

On May 5th, we received an updated list of names of those killed in Gaza up to April 30th. We've incorporated new records and existing record changes from that update.

The new list was in a PDF format that differed slightly from the prior lists, which were distributed in CSV & PDF format. You can download the Ministry of Health PDF that was used as the source for this update here.

We noted the following changes to their reporting format:

  • they dropped the "source" column we previously used to track public submissions vs. official sourcing, so we added a new "unknown" value to our existing source field if a new record has no source attribute (but we've kept the old value if the record already existed)
  • they removed the date of birth field and only reported age
  • a number of records did not have an identifier, so we generated one based on the report date and their reported index (prefixed with missing-)
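A sketch of how a placeholder identifier of this kind could be built. Only the `missing-` prefix and the two inputs (report date and reported index) come from the description above; the exact layout of the generated value is an assumption.

```python
from datetime import date


def ensure_id(raw_id: str, report_date: date, index: int) -> str:
    """Return the reported ID, or a generated `missing-` ID when it is blank."""
    cleaned = raw_id.strip()
    if cleaned:
        return cleaned
    # Assumed layout: missing-<report date>-<row index>.
    return f"missing-{report_date.isoformat()}-{index}"
```

Deriving the value from the report date and row index keeps it stable across re-runs of the same report, while the prefix makes generated identifiers easy to filter out.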

Change Summary

The following tables summarize the demographic changes in our Killed in Gaza list following its merge with the above-noted Ministry list:

Demographics of Our List Before Merge

Demographic | Number | %
Senior Men | 566 | 2.8%
Senior Women | 410 | 2.0%
Men | 8,115 | 39.8%
Women | 4,597 | 22.5%
Boys | 3,402 | 16.7%
Girls | 2,956 | 14.5%
Male (no age) | 182 | 0.9%
Female (no age) | 162 | 0.8%
Total Persons | 20,390

Demographics of Newly Added Records

Demographic | Number | %
Senior Men | 199 | 4.5%
Senior Women | 115 | 2.6%
Men | 2,396 | 53.6%
Women | 647 | 14.5%
Boys | 630 | 14.1%
Girls | 391 | 8.8%
Male (no age) | 66 | 1.5%
Female (no age) | 22 | 0.5%
Total Persons | 4,466

Demographics of Our Updated List After Merge

Demographic | Number | %
Senior Men | 760 | 3.1%
Senior Women | 524 | 2.1%
Men | 10,424 | 42.3%
Women | 5,204 | 21.1%
Boys | 4,005 | 16.2%
Girls | 3,330 | 13.5%
Male (no age) | 244 | 1.0%
Female (no age) | 181 | 0.7%
Total Persons | 24,672

Demographics of Removed Records

184 records in our prior list were not present in the latest list release (by identification number) so we removed them.

Demographic | Number | %
Senior Men | 5 | 2.7%
Senior Women | 1 | 0.5%
Men | 88 | 47.8%
Women | 40 | 21.7%
Boys | 27 | 14.7%
Girls | 18 | 9.8%
Male (no age) | 3 | 1.6%
Female (no age) | 2 | 1.1%
Total Persons | 184

We believe the higher ratio of Men in this revised list reflects the addition of community-reported sourcing. These records likely include more of those lost or missing for which remains were not received by health authorities as was the case for most of the records in the initial list distributed in January.

Merge Methodology / Commentary

Our methodology for updating existing records and accepting new ones didn't change and we detailed our approach in our prior April 13th update.

Where the names of existing records (matched by identification number) changed within our accepted threshold of 30%, the breakdown was as follows:

Change % upper bound | Number of occurrences
0% | 129
10% | 406
20% | 129
30% | 24

(the change threshold upper bound means that 20% would include a 12% or 18% change to the original name)
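The bucketing described in the note above amounts to rounding the change percentage up to the next multiple of 10. A minimal sketch, assuming the percentage is the edit distance divided by the original name's length (the function name is hypothetical):

```python
def change_bucket(distance: int, original_length: int) -> int:
    """Round the change percentage up to the next multiple of 10."""
    if original_length <= 0:
        raise ValueError("original name must not be empty")
    # Ceiling division in integers avoids floating-point edge cases.
    return -(-distance * 10 // original_length) * 10
```

For example, 3 edits on a 25-character name (12%) and 9 edits on a 50-character name (18%) both land in the 20% bucket, while an unchanged name lands in the 0% bucket.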

In terms of overall types of record changes across those already in our list at the time of merge, the breakdown was as follows:

Fields affected | Number of occurrences
Name | 676
None (Duplicate) | 399
Age and Name | 12
Only Age | 8

· 6 min read

On April 3rd, we received an updated list of names of those killed in Gaza up to March 29th. We've incorporated new records and existing record changes from that update.

The new list was in a PDF format that differed slightly from the initial lists, which were distributed in CSV format. It also included the source of each record: either the Ministry of Health ("سجلات وزارة الصحة") or a submission made by the public ("تبليغ ذوي الشهداء"). You can download the Ministry of Health PDF here. You can also download our earlier Killed in Gaza list in CSV format here to compare how individual records may have changed.

We've added a new source field to the records to indicate the reported source of the record as noted above.

Change Summary

The following tables summarize the demographic changes in our Killed in Gaza list following its merge with the above-noted Ministry list:

Demographics of Our List Before Merge

Demographic | Number | %
Men | 4,594 | 32.5
Women | 3,147 | 22.3
Boys | 2,545 | 18.0
Girls | 2,247 | 15.9
Senior Men | 328 | 2.3
Senior Women | 282 | 2.0
Male (no age) | 565 | 4.0
Female (no age) | 432 | 3.1
Total Persons | 14,140

Demographics of Newly Added Records

Demographic | Number | %
Men | 3,163 | 49.1
Women | 1,179 | 18.3
Boys | 929 | 14.4
Girls | 735 | 11.4
Sr. Men | 233 | 3.6
Sr. Women | 124 | 1.9
Male (no age) | 51 | 0.8
Female (no age) | 31 | 0.5
Total Persons | 6,445

Demographics of Our Updated List After Merge

Demographic | Number | %
Men | 8,115 | 39.8
Women | 4,597 | 22.5
Boys | 3,402 | 16.7
Girls | 2,956 | 14.5
Sr. Men | 566 | 2.8
Sr. Women | 410 | 2.0
Male (no age) | 182 | 0.9
Female (no age) | 162 | 0.8
Total Persons | 20,390

Demographics of Removed Records

195 records in our prior list were not present in the latest list release (by identification number) so we removed them.

Demographic | Number | %
Men | 63 | 32.3
Women | 27 | 13.8
Boys | 33 | 16.9
Girls | 11 | 5.6
Sr. Men | 4 | 2.1
Sr. Women | 2 | 1.0
Male (no age) | 43 | 22.1
Female (no age) | 12 | 6.2
Total Persons | 195

We believe the higher ratio of Men in this revised list reflects the addition of community-reported sourcing. These records likely include more of those lost or missing for which remains were not received by health authorities as was the case for most of the records in the initial list distributed in January.

Merge Methodology / Commentary

Incorporating the new list required a few steps:

  1. Parse the tabular data from the PDF
  2. Clean the parsed data for data format inconsistencies
  3. Render the data in a format comparable to our existing list
  4. Reconcile record conflicts & changes
  5. Merge and rewrite our existing source list
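Because records are matched on identification number, steps 4 and 5 start from a simple partition of the two lists. The sketch below assumes each list has already been loaded into a dictionary keyed by ID; the function and variable names are hypothetical.

```python
def partition_by_id(existing: dict[str, dict], incoming: dict[str, dict]):
    """Split record IDs into added, removed, and shared sets."""
    existing_ids, incoming_ids = set(existing), set(incoming)
    added = incoming_ids - existing_ids      # only in the new list
    removed = existing_ids - incoming_ids    # no longer in the new list
    shared = existing_ids & incoming_ids     # candidates for reconciliation
    return added, removed, shared
```

The `added` and `removed` sets correspond to the "newly added" and "removed" tables above, and only the `shared` set needs field-by-field comparison.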

Commentary on our approach for some of the steps follows.

Cleaning the Data

At this stage we worked to determine common issues with the parsed data and found the following cases:

  • date of birth formats were not standardized (e.g. long year vs. short year)
    • we worked to normalize these, and if the format was hard to decipher we validated against the provided age
  • age was sometimes repeated in both the age and date of birth columns (no date of birth)
    • we removed any age values from the date of birth column
  • identification number field sometimes had non-number values or was clearly invalid
    • we dropped these records
  • date of birth field was full of hashes (#)
    • we removed these and left the date of birth empty
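The cleaning rules above could be sketched roughly as follows. The field names `id`, `age` and `dob` are hypothetical, the century heuristic is just one possible way to "validate against the provided age", and the check for identifiers that are numeric but "clearly invalid" is not covered here.

```python
import re
from typing import Optional


def clean_row(row: dict) -> Optional[dict]:
    """Apply the cleaning rules to one parsed row; return None to drop it.

    Assumes hypothetical string fields "id", "age" and "dob".
    """
    ident = row["id"].strip()
    if not (ident.isascii() and ident.isdigit()):
        return None  # non-numeric (or, in this sketch, empty) ID: drop

    age = row["age"].strip()
    dob = row["dob"].strip()
    if re.fullmatch(r"#+", dob):
        dob = ""  # date of birth column contained only hashes
    elif dob == age:
        dob = ""  # age was repeated in the date of birth column

    return {**row, "id": ident, "age": age, "dob": dob}


def expand_short_year(two_digit_year: int, age: int, reference_year: int = 2024) -> int:
    """Pick the century whose implied age is closest to the reported age."""
    candidates = (1900 + two_digit_year, 2000 + two_digit_year)
    return min(candidates, key=lambda y: abs((reference_year - y) - age))
```

With a reported age of 13, a short year of `10` resolves to 2010 rather than 1910; with a reported age of 68, `55` resolves to 1955.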

This was an iterative process of gathering stats, updating cleaning logic, and reviewing the output in our standard format to assess how to repeat with refined logic.

Reconciling Conflicts & Changes

We focused on assessing record conflicts based on the provided identification number only. If our existing list had a record with the same identification value, we checked the field changes (the "diff") to determine whether the change was acceptable using the following methodology:

  • if the age only changed by a year, we allowed the change as it's likely a reference date or rounding issue (the initial Ministry list was provided in a form that had an unfixed reference date of the current day and our prior list fixed that to January 5, 2024 per source dating)
  • if a comparison of names using Levenshtein distance showed a change amounting to less than 30% of the original name's length, we allowed the change, but only if the new name didn't rely more on our fallback auto translation library than it did before
  • if an age or date of birth was not on the existing record and it was on the incoming one, we accepted it
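The acceptance rules above can be sketched as below. This is an illustration under stated assumptions rather than the project's actual code: the function names are hypothetical, the Levenshtein distance is a plain textbook implementation, and the additional check on reliance on the fallback translation library is left out.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # deletion
                current[j - 1] + 1,            # insertion
                previous[j - 1] + (ca != cb),  # substitution (free on match)
            ))
        previous = current
    return previous[-1]


def accept_name_change(old: str, new: str, threshold: float = 0.30) -> bool:
    """Accept when the edit distance is under `threshold` of the old length."""
    if not old:
        return False
    return levenshtein(old, new) / len(old) < threshold


def accept_age_change(old_age, new_age) -> bool:
    """Accept a newly supplied age, or a shift of at most one year."""
    if old_age is None:
        return new_age is not None
    if new_age is None:
        return False  # assumption: never replace a known age with a blank
    return abs(new_age - old_age) <= 1
```

For instance, "Mohammed Ahmad" and "Mohammad Ahmed" differ by 2 edits over 14 characters (about 14%), so that change would be accepted, while replacing "Ali" with "Omar" would not.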

This process helped us narrow in on specific record sets to refine our approach.

Where the names of existing records (matched by identification number) changed within our accepted threshold of 30%, the breakdown was as follows:

Change % upper bound | Number of occurrences
0% | 3,255
10% | 4,506
20% | 518
30% | 56

(the change threshold upper bound means that 20% would include a 12% or 18% change to the original name)

In terms of overall types of record changes across those already in our list at the time of merge, the breakdown was as follows:

Fields affected | Number of occurrences
Name | 6,158
None (Duplicate) | 4,089
Age and Name | 2,113
Only Age | 1,557
Age, Birth Date, and Name | 10
Age and Birth Date | 12
Birth Date and Name | 1

· 3 min read

We've made some significant changes to our previously published Killed in Gaza list, which has the names of those known to have been killed in Gaza since October 7th. This post provides more detail on our new methodology and what to expect about the changes.

Prior Method

Our prior list relied heavily on an existing library (arabic-names-to-en) which first tried to translate a name segment using a dictionary mapping, then fell back to a character-by-character lookup. We then had some volunteers do a visual review and incorporated manual changes. For a list of over 14 thousand names, this proved hard to manage.

New Method

We've since built our own dictionary mapping with more name coverage, and the process now looks like this:

  1. we clean formatting issues from the Arabic names in the original list (using dict_ar_ar.csv)
  2. we look up / translate each name part into English (using dict_ar_en.csv)
  3. we run final transformations when converting to JSON (see JSON export script)

The final step includes a fallback that relies on the old library for any remaining Arabic translations that are not yet in our curated dict_ar_ar.csv. Currently fewer than 2% of the names are partially handled by this fallback mechanism, and we'll be working to reduce that number.
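A simplified sketch of the lookup-then-fallback idea. The in-memory dictionary stands in for rows loaded from the CSV mapping, its two entries are made-up examples, and `fallback` is a placeholder for the old library rather than its real API.

```python
from typing import Callable


def translate_name(
    arabic_name: str,
    lookup: dict[str, str],
    fallback: Callable[[str], str],
) -> tuple[str, bool]:
    """Translate each space-separated part; report whether fallback was used."""
    parts, used_fallback = [], False
    for part in arabic_name.split():
        if part in lookup:
            parts.append(lookup[part])
        else:
            parts.append(fallback(part))
            used_fallback = True
    return " ".join(parts), used_fallback


# Hypothetical dictionary entries and a stand-in fallback for illustration.
lookup = {"محمد": "Mohammed", "أحمد": "Ahmed"}
english, used_fallback = translate_name("محمد أحمد", lookup, lambda p: f"<{p}>")
```

Returning the `used_fallback` flag alongside the translation makes it straightforward to measure what share of names still depends on the fallback.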

Notable Changes

We've avoided what we believe would have been breaking changes to the dataset per our versioning guide, but we did add 21 new records from the original official list released in November 2023. The IDs that were introduced from that November list include:

  • 401771530
  • 401844790
  • 405424524
  • 407194836
  • 411518053
  • 425923364
  • 436788202
  • 437391725
  • 438240293
  • 438445371
  • 441199296
  • 800328817
  • 802335927
  • 803827518
  • 804662112
  • 804669000
  • 901494161
  • 930025457
  • 932076094
  • 942125832
  • 95270068

The list before this change can be found on GitHub.

Here are some additional details about the current list & the latest revision:

  • there are 14,140 names
  • English name changes between this and the last published list, using Levenshtein distance:
    • 24% of names had no change
    • 60% of names had differences of between 1-4 edits, inclusive
    • 15% of names had differences of between 5-9 edits, inclusive
    • 1.9% of names had differences of 10 or more edits
  • 92 records (0.65%) had age changes from the prior release (all 1 year less than before)
  • 29 names have "unknown" for part or all of the name, and those are now represented in the English translation as ?

We're continually working to improve translations and the list in general. If you have ideas or want to contribute a change, please see our contributing guide.