Open
Description
We currently have an issue in the harvest item v-for loop caused by duplicate keys.
<tr
v-for="item in paginatedItems"
:key="item.remote_id"
>
Indeed, item.remote_id is not unique: "skipped" items have a null remote_id, and some items share the same remote id.
I looked for a unique property, but harvest items don't have an id, and it seems that started, created and ended are not 100% unique either.
Pipeline to check for uniqueness
from udata.harvest.models import HarvestJob, HarvestSource

pipeline = [
    # Step 1: unwind each HarvestJob into one document per HarvestItem
    {
        "$unwind": "$items"
    },
    # Step 2: group by HarvestJob and by item date (items.started is used here)
    {
        "$group": {
            "_id": {
                "job_id": "$_id",
                "created_date": "$items.started"
            },
            "count": {"$sum": 1},
            "job": {"$first": "$$ROOT"}
        }
    },
    # Step 3: keep only the groups where count > 1 (several items with the same date)
    {
        "$match": {
            "count": {"$gt": 1}
        }
    },
    # Step 4: regroup by job_id to avoid duplicates
    {
        "$group": {
            "_id": "$_id.job_id",
            "job": {"$first": "$job"},
            "duplicate_dates": {"$push": "$_id.created_date"}
        }
    },
    # Step 5: reshape the document for a clear display
    {
        "$project": {
            "job": 1,
            "duplicate_dates": 1
        }
    }
]

results = list(HarvestJob.objects(created__gte="2025-11-25").aggregate(*pipeline))
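For anyone who wants to sanity-check the logic without a MongoDB connection, here is a rough in-memory sketch of the same check on plain dicts. The function name `find_duplicate_started` and the sample data are hypothetical, and the dict shape (`_id`, `items`, `started`) is only assumed to mirror the fields the pipeline reads.

```python
from collections import Counter, defaultdict


def find_duplicate_started(jobs):
    """Return {job_id: [started values shared by more than one item]}.

    Hypothetical helper: mirrors the unwind / group / match steps of the
    pipeline, but on plain dicts instead of HarvestJob documents.
    """
    duplicates = defaultdict(list)
    for job in jobs:
        # Count how many items of this job share each "started" value.
        counts = Counter(item.get("started") for item in job.get("items", []))
        for started, count in counts.items():
            if count > 1:
                duplicates[job["_id"]].append(started)
    return dict(duplicates)


# Made-up sample data, for illustration only.
jobs = [
    {
        "_id": "job-a",
        "items": [
            {"started": "2025-11-26T10:00:00"},
            {"started": "2025-11-26T10:00:00"},
            {"started": "2025-11-26T10:00:01"},
        ],
    },
    {"_id": "job-b", "items": [{"started": "2025-11-26T11:00:00"}]},
]

result = find_duplicate_started(jobs)
```

With this sample, only `job-a` is reported, since two of its items share the same `started` value.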