Updates | Palestine Datasets

Killed in Gaza 06-30 Update

September 7, 2024 · 3 min read

On July 24, we received an updated list of names of those killed in Gaza up to June 30th. We've incorporated this list in its entirety, replacing our prior list.

Like other recent updates, the new list was distributed as a PDF. You can download the Ministry of Health PDF that was used as the source for this update here.

We noted the following changes to their reporting format from the last update:

they added back the "source" column we previously used to track public submissions vs. official sourcing, so we've accepted these values, which may overwrite values previously reported as "unknown" for this column in our prior list
they added back the date of birth field and kept the age

Changes to Update Methodology

In the past we accepted only a subset of record changes so that we would not have to verify each update: we rejected changes that differed drastically from the existing record (beyond an arbitrary change threshold, for example on the Arabic name). This kind of reconciliation has diminishing benefits now that the ministry's reporting format has mostly standardized. We also want to defer to the official source of truth rather than impose our own decisions on their process. For that reason you may notice the following main differences in output:

where we generated IDs for records where that field was missing, the field will now be empty - we did not consider this a breaking change given the vast majority of records have IDs
most records saw small age changes reflecting our wholesale acceptance of their record values - in the past we adjusted these based on the availability of data in prior reports and validation issues we saw when comparing dates of birth, but the reference date we used may have accounted for a discrepancy as well
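The effect of a reference date on a derived age can be shown with a minimal sketch. The helper name `age_on` is hypothetical and not part of our tooling, and any birth date passed to it is purely illustrative. The point is only that the same date of birth can yield ages one year apart depending on the day the age is measured from.

```python
from datetime import date


def age_on(birth: date, reference: date) -> int:
    """Return the age in whole years on the given reference date."""
    years = reference.year - birth.year
    # Subtract a year if the birthday has not yet occurred in the reference year.
    if (reference.month, reference.day) < (birth.month, birth.day):
        years -= 1
    return years
```

Calling `age_on` with the same birth date and two reference dates a few months apart will differ by one whenever a birthday falls between them, which is one plausible source of the small age changes described above.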

Otherwise the main changes we make to their reporting include:

normalizing the date of birth date format to YYYY-MM-DD
converting the source field from Arabic to an English abbreviation
adding the English name translation using our existing lookup tables
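The date normalization step can be sketched as follows. This is an illustration under assumptions, not our actual code: the function name `normalize_dob` and the list of candidate input formats (day-first where ambiguous) are hypothetical, since only the YYYY-MM-DD output format is stated above.

```python
from datetime import datetime

# Assumed input formats; the formats actually present in the source PDF are not listed here.
CANDIDATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d-%m-%Y", "%Y/%m/%d")


def normalize_dob(raw: str) -> str:
    """Return the date of birth as YYYY-MM-DD, or "" if no format matches."""
    value = raw.strip()
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(value, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue  # try the next candidate format
    return ""
```

Returning an empty string for unparseable values keeps the field empty rather than guessing at a date.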

Change Summary

We issued this update in two parts: a partial list update in August and another in early September.

The following tables summarize the updated demographics of the Killed in Gaza list following the latest update on September 7th, 2024:

Demographic	Number	%
Senior Men	982	3.5%
Senior Women	655	2.3%
Men	11,583	41.1%
Women	5,614	19.9%
Boys	5,202	18.5%
Girls	4,149	14.7%
Total Persons	28,185
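The percentage column can be reproduced from the counts in the table. The short sketch below uses only the numbers shown above; the variable names are arbitrary.

```python
# Counts copied from the table above.
counts = {
    "Senior Men": 982,
    "Senior Women": 655,
    "Men": 11583,
    "Women": 5614,
    "Boys": 5202,
    "Girls": 4149,
}

total = sum(counts.values())

# Share of each group as a percentage of the total, rounded to one decimal place.
shares = {group: round(100 * n / total, 1) for group, n in counts.items()}
```

Here `total` matches the Total Persons row, and each entry in `shares` matches the % column after rounding.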

Gaza Daily Reporting Period

August 17, 2024 · One min read

The Gaza Ministry of Health has started regularly issuing reports that cover 48-hour periods. The two most recent were on August 17th (with no report on August 16th) and August 12th (with no report on August 11th).

When a reporting date is skipped, as in the cases noted above, you'll see a new report_period field indicating 48 hours for the consolidated report and 0 hours for the skipped report. For example, August 17th has a value of 48 and August 16th has a value of 0. We're keeping the skipped report dates for consistency and to reflect the cumulative values on those dates whether or not a report was issued.

All existing report dates covering 24-hour periods will have a value of 24 for report_period.
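One way to derive report_period values like those described above is sketched here. The function name `assign_report_periods` and its inputs are hypothetical. The sketch also assumes that a longer gap would simply accumulate further 24-hour blocks, which goes beyond the 48-hour case described in this post.

```python
from datetime import date, timedelta


def assign_report_periods(reported: set[date], start: date, end: date) -> dict[date, int]:
    """Map every date in [start, end] to a report_period value in hours."""
    periods: dict[date, int] = {}
    skipped_hours = 0
    day = start
    while day <= end:
        if day in reported:
            # A report covers its own day plus any skipped days before it.
            periods[day] = 24 + skipped_hours
            skipped_hours = 0
        else:
            periods[day] = 0
            skipped_hours += 24
        day += timedelta(days=1)
    return periods
```

With reports on August 15th and 17th but none on the 16th, this yields 24, 0, and 48 for the three dates.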

Killed in Gaza 04-30 Update

June 26, 2024 · 4 min read

On May 5th, we received an updated list of names of those killed in Gaza up to April 30th. We've incorporated new records and existing record changes from that update.

The new list was in a PDF format that differed slightly from the prior lists that were distributed in CSV & PDF format. You can download the Ministry of Health PDF here that was used as the source for this update.

We noted the following changes to their reporting format:

they dropped the "source" column we previously used to track public submissions vs. official sourcing, so we added a new "unknown" value to our existing source field if a new record has no source attribute (but we've kept the old value if the record already existed)
they removed the date of birth field and only reported age
a number of records did not have an identifier, so we generated one based on the report date and their reported index (prefixed with missing-)
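Generating placeholder identifiers could look like the sketch below. Only the missing- prefix and the use of the report date and reported index come from the description above. The exact layout of the ID, the field names `id` and `index`, and both function names are assumptions made for illustration.

```python
def generate_missing_id(report_date: str, index: int) -> str:
    """Build a placeholder ID; everything after the "missing-" prefix is an assumed layout."""
    return f"missing-{report_date}-{index}"


def fill_missing_ids(records: list[dict], report_date: str) -> list[dict]:
    """Return copies of the records, adding a placeholder ID where "id" is empty."""
    filled = []
    for record in records:
        if record.get("id"):
            filled.append(dict(record))
        else:
            placeholder = generate_missing_id(report_date, record["index"])
            filled.append({**record, "id": placeholder})
    return filled
```

Records that already carry an identifier pass through unchanged, so placeholder IDs stay easy to distinguish from reported ones.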

Change Summary

The following tables summarize the demographic changes in our Killed in Gaza list following its merge with the above-noted Ministry list:

Demographics of Our List Before Merge

Demographic	Number	%
Senior Men	566	2.8%
Senior Women	410	2.0%
Men	8,115	39.8%
Women	4,597	22.5%
Boys	3,402	16.7%
Girls	2,956	14.5%
Male (no age)	182	0.9%
Female (no age)	162	0.8%
Total Persons	20,390

Demographics of Newly Added Records

Demographic	Number	%
Senior Men	199	4.5%
Senior Women	115	2.6%
Men	2,396	53.6%
Women	647	14.5%
Boys	630	14.1%
Girls	391	8.8%
Male (no age)	66	1.5%
Female (no age)	22	0.5%
Total Persons	4,466

Demographics of Our Updated List After Merge

Demographic	Number	%
Senior Men	760	3.1%
Senior Women	524	2.1%
Men	10,424	42.3%
Women	5,204	21.1%
Boys	4,005	16.2%
Girls	3,330	13.5%
Male (no age)	244	1.0%
Female (no age)	181	0.7%
Total Persons	24,672

Demographics of Removed Records

184 records in our prior list were not present in the latest list release (by identification number) so we removed them.

Demographic	Number	%
Senior Men	5	2.7%
Senior Women	1	0.5%
Men	88	47.8%
Women	40	21.7%
Boys	27	14.7%
Girls	18	9.8%
Male (no age)	3	1.6%
Female (no age)	2	1.1%
Total Persons	184

We believe the higher ratio of Men in this revised list reflects the addition of community-reported sourcing. These records likely include more people who are lost or missing and whose remains, unlike those for most of the records in the initial list distributed in January, were not received by health authorities.

Merge Methodology / Commentary

Our methodology for updating existing records and accepting new ones didn't change and we detailed our approach in our prior April 13th update.

Where there were changes in names for existing records (matched by identification number) within our accepted threshold of 30%, the breakdown was as follows:

change % upper bound	number of occurrences
0%	129
10%	406
20%	129
30%	24

(the change threshold upper bound means that 20% would include a 12% or 18% change to the original name)
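The bucketing into upper bounds can be sketched as below. This is an assumption about how such buckets might be computed, not our exact code: the function name `change_bucket` is hypothetical, and the change is taken to be the edit distance divided by the original name's length.

```python
def change_bucket(distance: int, original_length: int, step: int = 10) -> int:
    """Return the upper-bound bucket (in percent) for a name change.

    Integer ceiling division is used so that exact boundaries such as
    30% land in the 30% bucket without floating-point surprises.
    """
    scaled = distance * 100
    return -(-scaled // (original_length * step)) * step
```

For example, 3 edits on a 25-character name (12%) and 9 edits on a 50-character name (18%) both fall in the 20% bucket.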

In terms of overall types of record changes across those already in our list at the time of merge, the breakdown was as follows:

fields affected	number of occurrences
Name	676
None (Duplicate)	399
Age and Name	12
Only Age	8

World Press Freedom Day

May 2, 2024 · 6 min read

May 3rd is World Press Freedom Day. It acts as a reminder to governments of the need to respect commitments to press freedom. It is an opportunity to:

celebrate the fundamental principles of press freedom;
assess the state of press freedom throughout the world;
defend the media from attacks on their independence;
and pay tribute to journalists who have lost their lives in the line of duty

As of today and according to government sources in Gaza, 141 journalists & media workers have made the ultimate sacrifice in their efforts documenting what's happening in Gaza. Israel and its military have shown time and again that they actively target these essential workers and their families to prevent their stories from getting out to the world.

Their names follow below (or download the full CSV).

Aaid al-Najar • عائد النجار
Aasm al-Barash • عاصم البرش
Aasm Kamal Mousa • عاصم كمال موسى
Abd Aodah • عبدالكريم عودة
Abdal-hadi Habib • عبدالهادي حبيب
Abdalihamaid Aliqurinaoi • عبدالحميد القريناوي
Abdalrahamun Saymaa • عبدالرحمن صايمة
Abdalrahamun Shehab • عبدالرحمن شهاب
Abdul-halim Aodh • عبدالحليم عوض
Abdul-wahab Aoni Abu Aun • عبدالوهاب عوني أبو عون
Abdullah Alwan • عبدالله علوان
Abdullah Bares • عبدالله بريص
Abdullah Daroish • عبدالله درويش
Adel Zarab • عادل زعرب
Adham Husaonah • أدهم حسونة
Ahmed Abu Absah • أحمد أبو عبسة
Ahmed Abu Mhadi • أحمد أبو مهادي
Ahmed Aliqura • أحمد القرا
Ahmed Badir • أحمد بدير
Ahmed Fatima • أحمد فطيمة
Ahmed Jamal al-Madhoun • أحمد جمال المدهون
Ahmed Khair Addeen • أحمد خير الدين
Ahmed Shehab • أحمد شهاب
Alaa Aataallah • علا عطاالله
Alaa Hassan Alhmus • آلاء حسن الهمص
Alaaa Alnmar • علاء النمر
Ali Ashour • علي عاشور
Ali Nisaman • علي نسمان
Ali Salem Abu Ajwa • علي سالم أبو عجوة
Amal Zuhad • أمل زهد
Amaro Abu Haya • عمرو أبو حية
Anas Abu Shamala • أنس أبو شمالة
Angham Ahmed Adwan • أنغام أحمد عدوان
Assad Shmlkh • أسعد شملخ
Ayat Khadoura • آيات خضورة
Bilal Jadallah • بلال جادالله
Dr. Mahmoud Emad Issa • د. محمود عماد عيسى
Dr. Rizq Alghrablai • د. رزق الغرابلي
Dr. Zaid Abu Zaid • د. زيد أبو زايد
Duaa Jabbour • دعاء الجبور
Duaa Shraf • دعاء شرف
Emad Alohaidi • عماد الوحيدي
Essam al-Loulo • عصام اللولو
Essam Bhar • عصام بهار
Eyad Ahmed Alrawagh • إياد أحمد الرواغ
Eyad Motar • اياد مطر
Fawad Abu Khamish • فؤاد أبو خماش
Hamadah al-Yaziji • حمادة اليازجي
Hamzah Wael Aldhdoh • حمزة وائل الدحدوح
Hanan Aiad • حنان عياد
Hani al-Madhoun • هاني المدهون
Hanin Aliqtshan • حنين القطشان
Hassan Farajallah • حسان فرجالله
Heba Alabdlah • هبة العبادلة
Hitham Harara • هيثم حرارة
Hsham Alnwajhah • هشام النواجحة
Hthifah al-Najar • حذيفة النجار
Hthifah Lolo • حذيفة لولو
Husam Ammar • حسام عمار
Husam Mubarak • حسام مبارك
Husaonah Esleem • حسونة إسليم
Ibrahem Lafii • ابراهيم لافي
Ikram al-Shafi'i • أكرم الشافعي
Iman al-Aqailai • إيمان العقيلي
Jabr Abu Hadaros • جبر أبو هدروس
Jamal Alfaqaaoi • جمال الفقعاوي
Jamal Hniah • جمال هنية
Khalil Abu Aathra • خليل أبو عاذرة
Mahmoud Abu Tharefa • محمود أبو ظريفة
Mahmoud Moshtaha • محمود مشتهى
Mahmoud Motar • محمود مطر
Mahmoud Salem • محمود سالم
Majed Kshko • ماجد كشكو
Marawan Alswaf • مروان الصواف
Mjad Arnads • مجد عرندس
Mousa al-Barash • موسى البرش
Mshal Shahwan • مشعل شهوان
Muhammad Abdal-fattah Aataallah • محمد عبدالفتاح عطاالله
Muhammad Abdalkhaliq Alaf • محمد عبدالخالق العف
Muhammad Abu Dayra • محمد أبو داير
Muhammad Abu Hassira • محمد أبو حصيرة
Muhammad Abu Hatab • محمد أبو حطب
Muhammad Abu huwaidi • محمد أبو هويدي
Muhammad Abu Motar • محمد أبو مطر
Muhammad Abu Rizq • محمد أبو رزق
Muhammad Abu Sakheel • محمد أبو سخيل
Muhammad Abu Samra • محمد أبو سمرة
Muhammad Aiish • محمد عياش
Muhammad al-Jaja • محمد الجاجة
Muhammad al-Salhi • محمد الصالحي
Muhammad Albiari • محمد البياري
Muhammad Alhasani • محمد الحسني
Muhammad Ali • محمد علي
Muhammad Alrifi • محمد الريفي
Muhammad Alsaid Abu Sakheel • محمد السيد أبو سخيل
Muhammad Alshoarabji • محمد الشوربجي
Muhammad Althalatheeny • محمد الثلاثيني
Muhammad Alzq • محمد الزق
Muhammad Baloshah • محمد بعلوشة
Muhammad Bsam Aljml • محمد بسام الجمل
Muhammad Farajallah • محمد فرجالله
Muhammad Jarghoun • محمد جرغون
Muhammad Khair Addeen • محمد خير الدين
Muhammad Khidr Salama • محمد خضر سلامة
Muhammad Labad • محمد لبد
Muhammad Raslan Shanyoura • محمد رسلان شنيورة
Muhammad Tishreen Yaghi • محمد تشرين ياغي
Muhammad Yunus al-Zaytouniyah • محمد يونس الزيتونية
Muntsar Alswaf • منتصر الصواف
Musab Abu Zaid • مصعب أبو زايد
Musab Ashour • مصعب عاشور
Mustafa Alnqaib • مصطفى النقيب
Mustafa Alswaf • مصطفى الصواف
Mustafa Bkir • مصطفى بكير
Mustafa Thraia • مصطفى ثريا
Nadr Alnazlai • نادر النزلي
Nafath Abd al-Jawad • نافذ عبد الجواد
Nizmai Alnadiam • نظمي النديم
Nrmain Qawas • نرمين قواس
Rajab Alnqaib • رجب النقيب
Rami Badir • رامي بدير
Rushdi Alsraj • رشدي السراج
Saeed Altoil • سعيد الطويل
Salam Mema • سلام ميمة
Salma Mukhiamar • سلمى مخيمر
Samaih al-Nadi • سميح النادي
Samer Abu Daqqa • سامر أبو دقة
Sari Munasoar • ساري منصور
Sayed Halabi • سائد حلبي
Shaimaa Aljzar • شيماء الجزار
Sharif Akasha • شريف عكاشة
Tariq Alsaid Abu Sakheel • طارق السيد أبو سخيل
Wael Rajab Abu Fanonna • وائل رجب أبو فنونة
Yahya Abu Manih • يحيى أبو منيع
Yaqob al-Barash • يعقوب البرش
Yasr Abu Namoos • ياسر أبو ناموس
Yasr Mmdoh • ياسر ممدوح
Yazin Alzoidi • يزن الزويدي
Zahir al-Afghani • زاهر الأفغاني

You can read more about some of these individuals and how they lived, worked, and were killed through the Committee to Protect Journalists, which maintains a searchable database and an Israel/Gaza tracker page.

You can visit the Press Killed in Gaza dataset page to learn more about how we source updates and find other ways to work with the list. We'll continue to update the dataset as long as the attacks are ongoing and the Gaza Media Office provides updates.

Killed in Gaza 03-29 Update

April 29, 2024 · 6 min read

On April 3rd, we received an updated list of names of those killed in Gaza up to March 29th. We've incorporated new records and existing record changes from that update.

The new list was in a PDF format that differed slightly from the initial lists, which were distributed in CSV format. It also identified the source of each record as either the Ministry of Health ("سجالت وزارة الصحة") or a submission made by the public ("تبيلغ ذوي الشهداء"). You can download the Ministry of Health PDF here. You can also download our earlier Killed in Gaza list in CSV format here to compare how individual records may have changed.

We've added a new source field to the records to indicate the reported source of the record as noted above.

Change Summary

The following tables summarize the demographic changes in our Killed in Gaza list following its merge with the above-noted Ministry list:

Demographics of Our List Before Merge

Demographic	Number	%
Men	4,594	32.5
Women	3,147	22.3
Boys	2,545	18.0
Girls	2,247	15.9
Senior Men	328	2.3
Senior Women	282	2.0
Male (no age)	565	4.0
Female (no age)	432	3.1
Total Persons	14,140

Demographics of Newly Added Records

Demographic	Number	%
Men	3,163	49.1
Women	1,179	18.3
Boys	929	14.4
Girls	735	11.4
Senior Men	233	3.6
Senior Women	124	1.9
Male (no age)	51	0.8
Female (no age)	31	0.5
Total Persons	6,445

Demographics of Our Updated List After Merge

Demographic	Number	%
Men	8,115	39.8
Women	4,597	22.5
Boys	3,402	16.7
Girls	2,956	14.5
Senior Men	566	2.8
Senior Women	410	2.0
Male (no age)	182	0.9
Female (no age)	162	0.8
Total Persons	20,390

Demographics of Removed Records

195 records in our prior list were not present in the latest list release (by identification number) so we removed them.

Demographic	Number	%
Men	63	32.3
Women	27	13.8
Boys	33	16.9
Girls	11	5.6
Senior Men	4	2.1
Senior Women	2	1.0
Male (no age)	43	22.1
Female (no age)	12	6.2
Total Persons	195

Merge Methodology / Commentary

Incorporating the new list required a few steps:

Parse the tabular data from the PDF
Clean the parsed data for data format inconsistencies
Render the data in a format comparable to our existing list
Reconcile record conflicts & changes
Merge and rewrite our existing source list

Commentary on our approach for some of the steps follows.

Cleaning the Data

At this stage we worked to determine common issues with the parsed data and found the following cases:

date of birth formats were not standardized (e.g. long year vs. short year)
- we worked to normalize these, and if the format was hard to decipher we validated against the provided age
age was sometimes repeated in both the age and date of birth columns (no date of birth)
- we removed any age values from the date of birth column
identification number field sometimes had non-number values or was clearly invalid
- we dropped these records
date of birth field was full of hashes (#)
- we removed these and left the date of birth empty

This was an iterative process of gathering stats, updating cleaning logic, and reviewing the output in our standard format to assess how to repeat with refined logic.
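A few of the cleaning cases above can be sketched as a single row-level function. The field names (`id`, `dob`, `age`) and the function name `clean_row` are hypothetical. Only the non-numeric identifier check is modelled, since what counts as "clearly invalid" is not specified here, and date format normalization is left out.

```python
import re
from typing import Optional


def clean_row(row: dict) -> Optional[dict]:
    """Clean one parsed row; return None when the record should be dropped."""
    id_value = row.get("id", "").strip()
    if not re.fullmatch(r"[0-9]+", id_value):
        return None  # identifier has non-number values: drop the record

    dob = row.get("dob", "").strip()
    age = row.get("age", "").strip()
    if set(dob) <= {"#"}:
        dob = ""  # column was empty or filled with hashes
    elif dob == age or re.fullmatch(r"[0-9]{1,3}", dob):
        dob = ""  # an age value was repeated in the date of birth column
    return {**row, "id": id_value, "dob": dob, "age": age}
```

Returning `None` for dropped records makes it easy to count how many rows each rule removes while iterating on the logic.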

Reconciling Conflicts & Changes

We focused on assessing record conflicts based on the provided identification number only. If our existing list had a record with the same identification value, we checked the field changes (the "diff") to determine whether the change was acceptable using the following methodology:

if the age only changed by a year, we allowed the change as it's likely a reference date or rounding issue (the initial Ministry list was provided in a form that had an unfixed reference date of the current day and our prior list fixed that to January 5, 2024 per source dating)
if a comparison of names using Levenshtein Distance led to a change amounting to less than 30% of the original name's length, we allowed the change, but only if the new name didn't rely more on our fallback auto translation library than it did before
if an age or date of birth was not on the existing record and it was on the incoming one, we accepted it

This process helped us narrow in on specific record sets to refine our approach.
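The name and age rules can be sketched as follows. This is a simplified illustration rather than our exact code: the function names are hypothetical, the Levenshtein distance is implemented directly instead of through a library, and the check on reliance on the fallback auto translation library is not modelled.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def name_change_allowed(old: str, new: str, threshold: float = 0.30) -> bool:
    """Allow a change smaller than the threshold share of the original name's length."""
    if not old:
        return False  # assumption: an empty original name is never auto-accepted
    return levenshtein(old, new) / len(old) < threshold


def age_change_allowed(old_age: int, new_age: int) -> bool:
    """Allow age changes of at most one year."""
    return abs(old_age - new_age) <= 1
```

A single-character change in an eight-character name is a 12.5% change and passes, while a change of three edits in a six-character name is 50% and does not.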

Where there were changes in names for existing records (matched by identification number) within our accepted threshold of 30%, the breakdown was as follows:

change % upper bound	number of occurrences
0%	3,255
10%	4,506
20%	518
30%	56

(the change threshold upper bound means that 20% would include a 12% or 18% change to the original name)

In terms of overall types of record changes across those already in our list at the time of merge, the breakdown was as follows:

fields affected	number of occurrences
Name	6,158
None (Duplicate)	4,089
Age and Name	2,113
Only Age	1,557
Age, Birth Date, and Name	10
Age and Birth Date	12
Birth Date and Name	1

West Bank Frequency Change

March 25, 2024 · 2 min read

The United Nations Office for the Coordination of Humanitarian Affairs (UN OCHA) announced today in their regular Flash Update that their reporting frequency would be changing from the cadence we've grown used to.

For the West Bank in particular, which is the part of our dataset that is affected, Flash Reports will only be issued on a weekly basis, and the wider report on the region will be issued at most three times a week.

Going forward we will continue to update our daily datasets for Gaza and the West Bank at the same time, but we've now introduced a new field to the West Bank daily casualties dataset called flash_source. This field will be one of the following values:

un which indicates that the values for the given date were reported in a UN OCHA Flash Update for that date; or
fill which indicates that the values were from a UN OCHA Flash Update for a prior date

If a subsequent update includes specific information for a date we previously marked as fill, we will update that date in the time series with the more accurate values. For the most part, fill values will simply repeat the values from the latest report date where flash_source is equal to un.
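The fill behaviour can be sketched as a simple forward fill. Only the flash_source field and its un and fill values come from the dataset. The value field name `killed_cum`, the row layout, and the function name are assumptions for illustration.

```python
def label_flash_source(days: list[dict]) -> list[dict]:
    """Forward-fill missing values and label each row as "un" or "fill".

    Rows are assumed to be sorted by date; a value of None means no
    Flash Update was issued for that date.
    """
    labelled = []
    last_reported = None
    for day in days:
        if day.get("killed_cum") is not None:
            last_reported = day["killed_cum"]
            labelled.append({**day, "flash_source": "un"})
        else:
            # Carry the last reported value forward for dates without a report.
            labelled.append({**day, "killed_cum": last_reported, "flash_source": "fill"})
    return labelled
```

If a later report supplies a specific value for a date that was filled, re-running the function on the corrected input relabels that date as un.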

West Bank Daily & V3 Summary

February 22, 2024 · One min read

In a continuing effort to provide data that can be used to tell the Palestinian story since October 7th, we're adding a new Daily Casualties dataset for the West Bank. This dataset aims to complement our existing time series for Gaza so that they can be combined as needed.

We aim to update both datasets daily at the same time, and their latest values are now also reflected in a new V3 version of our Summary dataset. You can still access the existing V2 summary dataset and we will continue to update it as before, but we encourage upgrading.

Improved Name Translations

February 6, 2024 · 3 min read

We've made some significant changes to our previously published Killed in Gaza list, which has the names of those known to have been killed in Gaza since October 7th. This post provides more detail on our new methodology and what to expect about the changes.

Prior Method

Our prior list relied heavily on an existing library (arabic-names-to-en) which first tried to translate a name segment using a dictionary mapping, then fell back to a character-by-character lookup. We then had some volunteers do a visual review and incorporated manual changes. For a list of over 14 thousand names, this proved hard to manage.

New Method

We've since built our own dictionary mapping with more name coverage, and the process now looks like this:

we clean Arabic names in the original list of formatting issues (using dict_ar_ar.csv)
we look up / translate each name part into English (using dict_ar_en.csv)
we run final transformations when converting to JSON (see JSON export script)

The final step includes a fallback that relies on the old library for remaining Arabic translations that are not yet in our curated dict_ar_ar.csv. Currently fewer than 2% of the names are partially handled by this fallback mechanism, and we'll be working to reduce that number.
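The lookup-with-fallback flow can be sketched as below. This is a simplification under assumptions: the two dictionaries stand in for the contents of dict_ar_ar.csv and dict_ar_en.csv (whose column layout is not shown here), names are split on whitespace, and `fallback` is a placeholder for whatever the old library exposes rather than its real API.

```python
from typing import Callable


def translate_name(
    arabic_name: str,
    ar_ar: dict[str, str],
    ar_en: dict[str, str],
    fallback: Callable[[str], str],
) -> str:
    """Translate a name part by part, using the fallback only for unknown parts."""
    english_parts = []
    for part in arabic_name.split():
        cleaned = ar_ar.get(part, part)  # step 1: normalize formatting issues
        english = ar_en.get(cleaned)     # step 2: curated dictionary lookup
        if english is None:
            english = fallback(cleaned)  # step 3: fall back for uncovered parts
        english_parts.append(english)
    return " ".join(english_parts)
```

Counting how often `fallback` is called gives a rough measure of how many names are still only partially covered by the curated dictionary.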

Notable Changes

We've avoided what we believe would have been breaking changes to the dataset per our versioning guide, but we did add 21 new records from the original official list released in November 2023. The IDs that were introduced from that November list include:

401771530
401844790
405424524
407194836
411518053
425923364
436788202
437391725
438240293
438445371
441199296
800328817
802335927
803827518
804662112
804669000
901494161
930025457
932076094
942125832
95270068

The list before this change can be found on GitHub.

Here are some additional details about the current list & the latest revision:

there are 14,140 names
English name changes between this and the last published list, using Levenshtein distance:
- 24% of names had no change
- 60% of names had differences of between 1-4 edits, inclusive
- 15% of names had differences of between 5-9 edits, inclusive
- 1.9% of names had differences of 10 or more edits
92 records (0.65%) had age changes from the prior release (all 1 year less than before)
29 names have "unknown" for part or all of the name, and those are now represented in the English translation as ?

We're continually working to improve translations and the list in general. If you have ideas or want to contribute a change, please see our contributing guide.
