# Action Guide: PL11089 Showing PL689 Content in PDF

## 🔴 Current Situation

- ✅ SQL query matching is **FIXED** (PL11089 queries no longer match PL689 in database)
- ❌ PDF conversion is **STILL WRONG** (PL11089 PDF shows PL689 content)

This points to one of two causes: a stale cached PDF, or **incorrect associations in the database** between document numbers and actual files. The steps below determine which one applies.

## 🚀 Immediate Actions (Run These Now)

### Step 1: Run Diagnostics (5 minutes)

```bash
cd /home/plagis/workspace/plagis_aumentum

# This shows what files are associated with PL11089 vs PL689 in the database
python3 diagnose_image_associations.py
```

**What to look for**:
- Are the `content_url` paths different for PL11089 vs PL689?
- Do the file timestamps match the document dates?
- Are there any cross-contamination warnings?
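
The first and third checks can be automated once the diagnostic output is in hand. The sketch below is not part of `diagnose_image_associations.py`. It assumes you have copied the `content_url` values for each document number into plain lists; the function name and the sample URLs are hypothetical. It reports any URL that is referenced by more than one document.

```python
from typing import Dict, Iterable, Set


def find_shared_urls(urls_by_document: Dict[str, Iterable[str]]) -> Dict[str, Set[str]]:
    """Map each content_url to the document numbers that reference it,
    keeping only URLs referenced by more than one document."""
    owners: Dict[str, Set[str]] = {}
    for document_number, urls in urls_by_document.items():
        for url in urls:
            owners.setdefault(url, set()).add(document_number)
    return {url: docs for url, docs in owners.items() if len(docs) > 1}


# Hypothetical sample data; replace with the URLs from the diagnostic output.
sample = {
    "PL11089": ["store://example/a.bin", "store://example/shared.bin"],
    "PL689": ["store://example/b.bin", "store://example/shared.bin"],
}
shared = find_shared_urls(sample)
for url, docs in sorted(shared.items()):
    print(f"{url} is referenced by {sorted(docs)}")
```

An empty result means the two documents do not share any file, which makes a stale cache the more likely cause.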

---

### Step 2: Clear Cache and Test Fresh (5 minutes)

```bash
cd /home/plagis/workspace/plagis_aumentum

# This clears cached PDFs and generates a fresh one
./clear_cache_and_test.sh
```

**What this does**:
1. Removes all cached PL11089 and PL689 PDFs
2. Queries database for fresh associations
3. Generates new PDF from scratch
4. Saves to `/tmp/test_PL11089_fresh.pdf`
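
For reference, step 1 of that list could look roughly like the sketch below. The internals of `clear_cache_and_test.sh` are not shown in this guide, so this is an illustration only: it assumes a cache directory in which each PDF is named `<document_number>.pdf`, which is a hypothetical naming scheme, and the function name is made up for the example.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List


def clear_cached_pdfs(cache_dir: Path, document_numbers: Iterable[str]) -> List[Path]:
    """Delete cached PDFs whose file name is exactly '<document_number>.pdf'.

    Exact matching on the stem avoids removing, say, PL6890.pdf when
    clearing PL689.
    """
    wanted = set(document_numbers)
    removed = []
    for pdf in sorted(cache_dir.glob("*.pdf")):
        if pdf.stem in wanted:
            pdf.unlink()
            removed.append(pdf)
    return removed


# Demonstration against a throwaway directory; the real cache path is not assumed.
with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp)
    for name in ("PL11089.pdf", "PL689.pdf", "PL6890.pdf"):
        (cache / name).write_bytes(b"%PDF-1.4\n")
    removed_names = [p.name for p in clear_cached_pdfs(cache, ["PL11089", "PL689"])]
    remaining_names = sorted(p.name for p in cache.glob("*.pdf"))
    print("removed:", removed_names)
    print("remaining:", remaining_names)
```

Matching on the exact document number rather than a prefix matters here, because loose matching between document numbers is the kind of error this guide is chasing.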

---

### Step 3: Manual Verification (2 minutes)

```bash
# Open the freshly generated PDF
xdg-open /tmp/test_PL11089_fresh.pdf

# Or on Mac:
# open /tmp/test_PL11089_fresh.pdf
```

**Check**:
- Does the PDF show **PL11089** content? ✅ Problem solved (it was a cache issue); see Result A below.
- Does the PDF show **PL689** content? ❌ The database has wrong associations; see Result B below.

---

## 📊 Interpretation Guide

### Result A: PDF Now Shows Correct Content (PL11089)

**Cause**: The cache was serving a stale, incorrect PDF

**Status**: ✅ **FIXED**

**Action**: None needed. Cache will serve fresh PDFs going forward.

---

### Result B: PDF Still Shows Wrong Content (PL689)

**Cause**: Database has incorrect file associations

**Status**: ❌ **DATABASE CORRECTION NEEDED**

**Evidence**: The diagnostic script likely showed that a node labeled PL11089 in `alf_node_properties` resolves to a `content_url` that belongs to PL689:
```
string_value = "PL11089"
content_url = "store://path/to/PL689_actual_file.bin"  ← Wrong!
```

**Next Steps**: See section "Fixing Database Associations" below

---

## 🔧 Fixing Database Associations

If the diagnostic shows database association problems, here are your options:

### Option 1: Query Database to Find Wrong Associations

```bash
# Work from the project directory, then run the query file below
# against the database with your usual SQL client.
cd /home/plagis/workspace/plagis_aumentum
```

Create a SQL query file:
```sql
-- check_associations.sql
-- Find what files are linked to PL11089
SELECT 
    np.node_id,
    np.string_value AS document_number,
    cu.content_url,
    n.audit_created AS node_created_date,
    cd.content_url_id
FROM LRSAdmin.alf_node_properties np
JOIN LRSAdmin.alf_qname q ON q.id = np.qname_id
JOIN LRSAdmin.alf_node n ON n.id = np.node_id
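-- NOTE: this join assumes alf_content_data.id equals the node id. In a stock
-- Alfresco schema, content data is referenced from alf_node_properties.long_value
-- instead, so verify the join against your schema before trusting the result.
LEFT JOIN LRSAdmin.alf_content_data cd ON cd.id = n.id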
LEFT JOIN LRSAdmin.alf_content_url cu ON cu.id = cd.content_url_id
WHERE RTRIM(LTRIM(np.string_value)) COLLATE Latin1_General_BIN = 'PL11089'
AND q.local_name IN ('targetRids','sourceRids');

-- Find what files are linked to PL689
SELECT 
    np.node_id,
    np.string_value AS document_number,
    cu.content_url,
    n.audit_created AS node_created_date,
    cd.content_url_id
FROM LRSAdmin.alf_node_properties np
JOIN LRSAdmin.alf_qname q ON q.id = np.qname_id
JOIN LRSAdmin.alf_node n ON n.id = np.node_id
LEFT JOIN LRSAdmin.alf_content_data cd ON cd.id = n.id
LEFT JOIN LRSAdmin.alf_content_url cu ON cu.id = cd.content_url_id
WHERE RTRIM(LTRIM(np.string_value)) COLLATE Latin1_General_BIN = 'PL689'
AND q.local_name IN ('targetRids','sourceRids');

-- Compare the creation dates with document acceptance dates
SELECT id, document_number, acceptance, recordation
FROM LRSAdmin.lr_source_document
WHERE document_number IN ('PL11089', 'PL689');
```

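**Compare**:
- If `node_created_date` for a node labeled "PL11089" matches PL689's acceptance date rather than PL11089's own → Wrong association!
- If the same `content_url` appears in both result sets, or a PL11089 path falls in a date folder that fits PL689 instead → Wrong association!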
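
The date comparison can be scripted once the three result sets are exported. The following is a minimal sketch under stated assumptions: the rows are reduced to `(node_id, document_number, created_date)` tuples, the function name is hypothetical, and the dates in the sample are invented for illustration rather than taken from the database.

```python
from datetime import date
from typing import Dict, List, Tuple


def flag_suspect_nodes(
    nodes: List[Tuple[int, str, date]],
    acceptance: Dict[str, date],
    tolerance_days: int = 0,
) -> List[Tuple[int, str, str]]:
    """Return (node_id, labelled_as, matches_document) for every node whose
    creation date matches another document's acceptance date but not its own."""
    def close(a: date, b: date) -> bool:
        return abs((a - b).days) <= tolerance_days

    suspects = []
    for node_id, labelled_as, created in nodes:
        own = acceptance.get(labelled_as)
        if own is not None and close(created, own):
            continue  # creation date agrees with the node's own document
        for other, accepted in acceptance.items():
            if other != labelled_as and close(created, accepted):
                suspects.append((node_id, labelled_as, other))
    return suspects


# Hypothetical dates for illustration only; use the values returned by the queries.
acceptance_dates = {"PL11089": date(2020, 5, 12), "PL689": date(2014, 11, 3)}
node_rows = [
    (101, "PL11089", date(2014, 11, 3)),   # created on PL689's acceptance date
    (102, "PL11089", date(2020, 5, 12)),   # created on its own acceptance date
]
suspects = flag_suspect_nodes(node_rows, acceptance_dates)
print(suspects)
```

A flagged node is a candidate for review, not proof of a wrong association. Scans made long after acceptance would not match either date, so raise `tolerance_days` only if that fits how the documents were scanned.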

---

### Option 2: Create Workaround with Exclusion List

If you can't modify the database immediately, create a workaround:

```python
# Add to aumentum_browser_service.py after line 806
from typing import Dict, List

# Known incorrect associations (temporary workaround).
# Sets give fast membership checks and ignore accidental duplicates.
WRONG_ASSOCIATIONS = {
    'PL11089': {
        'store://2014/11/3/14/22/...',  # Add actual wrong URLs from diagnostic
    },
}

def resolve_store_urls_by_document_number(self, document_number: str) -> List[Dict]:
    # ... existing code ...

    # After line 863 (after getting db_images), add:
    wrong_urls = WRONG_ASSOCIATIONS.get(document_number, ())
    if wrong_urls:
        before = len(db_images)
        db_images = [img for img in db_images
                     if img['content_url'] not in wrong_urls]
        # Report how many rows were actually dropped, not the size of the list
        print(f"   ⚠️ Filtered out {before - len(db_images)} known wrong associations")
```
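
The filtering step can also be pulled out into a pure function so it can be tested without the service or a database. This is a sketch, not code from `aumentum_browser_service.py`: the function name is hypothetical, and the sample rows only assume that each image is a dict with a `content_url` key, as in the snippet above.

```python
from typing import Dict, Iterable, List


def drop_known_wrong(images: List[Dict], wrong_urls: Iterable[str]) -> List[Dict]:
    """Return the images whose content_url is not in the exclusion list."""
    excluded = set(wrong_urls)
    return [img for img in images if img["content_url"] not in excluded]


# Hypothetical rows shaped like db_images; the URLs are placeholders.
db_images = [
    {"content_url": "store://example/right.bin"},
    {"content_url": "store://example/wrong.bin"},
]
kept = drop_known_wrong(db_images, ["store://example/wrong.bin"])
print(f"kept {len(kept)} of {len(db_images)} images")
```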

---

### Option 3: Database Correction (Permanent Fix)

⚠️ **WARNING**: Only do this if you have database admin access and have verified the correct associations!

```sql
-- BACKUP FIRST!
SELECT * INTO LRSAdmin.alf_node_properties_backup_20251103
FROM LRSAdmin.alf_node_properties;

-- Update wrong association (replace <wrong_node_id> with the actual value from the diagnostic)
UPDATE LRSAdmin.alf_node_properties
SET string_value = 'PL689'  -- Correct document number
WHERE node_id = <wrong_node_id>
AND RTRIM(LTRIM(string_value)) COLLATE Latin1_General_BIN = 'PL11089'  -- Only touch rows currently mislabeled
AND qname_id IN (
    SELECT id FROM LRSAdmin.alf_qname 
    WHERE local_name IN ('targetRids','sourceRids')
);

-- Verify the fix (same <wrong_node_id> as above)
SELECT np.node_id, np.string_value, cu.content_url
FROM LRSAdmin.alf_node_properties np
JOIN LRSAdmin.alf_qname q ON q.id = np.qname_id
JOIN LRSAdmin.alf_node n ON n.id = np.node_id
LEFT JOIN LRSAdmin.alf_content_data cd ON cd.id = n.id
LEFT JOIN LRSAdmin.alf_content_url cu ON cu.id = cd.content_url_id
WHERE np.node_id = <wrong_node_id>
AND q.local_name IN ('targetRids','sourceRids');
```
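
If the correction is scripted rather than typed into a SQL console, it is worth running it inside a transaction and committing only when exactly one row changed. The sketch below shows that pattern with Python's built-in `sqlite3` as a stand-in. It does not connect to the real database, the `node_properties` table is a simplified, hypothetical version of `alf_node_properties` without the `qname_id` filter, and `relabel_node` is a made-up name.

```python
import sqlite3


def relabel_node(conn, node_id, old_value, new_value):
    """Relabel one property row inside a transaction; roll back unless
    exactly one row changed. Returns True when the change was committed."""
    cur = conn.cursor()
    cur.execute(
        "UPDATE node_properties SET string_value = ? "
        "WHERE node_id = ? AND TRIM(string_value) = ?",
        (new_value, node_id, old_value),
    )
    if cur.rowcount != 1:
        conn.rollback()  # nothing, or more than expected, matched the guard
        return False
    conn.commit()
    return True


# Stand-in data: node 1 is mislabeled, node 2 is already correct.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE node_properties (node_id INTEGER, string_value TEXT)")
conn.executemany(
    "INSERT INTO node_properties VALUES (?, ?)",
    [(1, "PL11089"), (2, "PL689")],
)
conn.commit()

first = relabel_node(conn, 1, "PL11089", "PL689")   # guard matches: committed
second = relabel_node(conn, 2, "PL11089", "PL689")  # guard does not match: rolled back
values = conn.execute(
    "SELECT node_id, string_value FROM node_properties ORDER BY node_id"
).fetchall()
print(first, second, values)
```

The guard on the current value means a mistyped node ID changes nothing instead of silently relabeling an unrelated row.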

---

## 📋 Quick Checklist

- [ ] Run `diagnose_image_associations.py` 
- [ ] Run `clear_cache_and_test.sh`
- [ ] Open `/tmp/test_PL11089_fresh.pdf`
- [ ] Verify if PDF shows correct content
- [ ] If still wrong, identify specific wrong node IDs from diagnostic
- [ ] Choose: Database correction OR temporary workaround
- [ ] Test again after fix

---

## 🎯 Expected Timeline

| Task | Time | Status |
|------|------|--------|
| Run diagnostics | 5 min | ⏳ Pending |
| Clear cache & test | 5 min | ⏳ Pending |
| Verify PDF content | 2 min | ⏳ Pending |
| **If cache issue** | - | - |
| └─ Done! | 0 min | ✅ Fixed |
| **If database issue** | - | - |
| └─ Identify wrong nodes | 10 min | ⏳ Pending |
| └─ Implement workaround | 15 min | ⏳ Pending |
| └─ OR Database correction | 30 min | ⏳ Pending |
| └─ Test and verify | 5 min | ⏳ Pending |

---

## 💡 Quick Commands Reference

```bash
# Diagnostic
cd /home/plagis/workspace/plagis_aumentum
python3 diagnose_image_associations.py

# Clear cache and test
./clear_cache_and_test.sh

# View generated PDF
xdg-open /tmp/test_PL11089_fresh.pdf

# Clear specific cache via API
curl -X DELETE "http://localhost:8001/documents/PL11089/cache"

# Regenerate specific PDF
curl "http://localhost:8001/documents/pdf-by-document-number?document_number=PL11089&document_id=<doc_id>" \
  -o test.pdf
```
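
The two API calls can also be driven from Python, which is convenient when retesting several documents. This is a minimal sketch using only the standard library. The host, port, and paths are copied from the curl commands above; the function names are hypothetical, and the response format of the cache endpoint is not documented here, so the response body is ignored.

```python
import urllib.request
from urllib.parse import quote, urlencode

BASE_URL = "http://localhost:8001"  # host and port taken from the curl commands above


def cache_url(document_number: str, base_url: str = BASE_URL) -> str:
    """URL of the cache-clearing endpoint for one document number."""
    return f"{base_url}/documents/{quote(document_number, safe='')}/cache"


def pdf_url(document_number: str, document_id: str, base_url: str = BASE_URL) -> str:
    """URL that regenerates the PDF for one document."""
    query = urlencode({"document_number": document_number, "document_id": document_id})
    return f"{base_url}/documents/pdf-by-document-number?{query}"


def clear_cache_and_fetch(document_number: str, document_id: str, out_path: str) -> int:
    """Clear the cached PDF, request a fresh one, write it to out_path,
    and return the number of bytes written."""
    delete = urllib.request.Request(cache_url(document_number), method="DELETE")
    with urllib.request.urlopen(delete, timeout=30):
        pass  # response body is ignored; only an HTTP error would raise here
    with urllib.request.urlopen(pdf_url(document_number, document_id), timeout=120) as resp:
        data = resp.read()
    with open(out_path, "wb") as fh:
        fh.write(data)
    return len(data)
```

Calling `clear_cache_and_fetch("PL11089", "<doc_id>", "test.pdf")` with the real document ID mirrors the last two curl commands. The service must be running for the call to succeed.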

---

## 🆘 Support

If you need help interpreting the diagnostic output or implementing the fix:

1. **Share diagnostic output**: Copy the output from `diagnose_image_associations.py`
2. **Share PDF verification result**: After opening the test PDF, confirm what content it shows
3. **Provide context**: When were these documents scanned? Have they been corrected before?

---

**START HERE**: 
```bash
cd /home/plagis/workspace/plagis_aumentum && python3 diagnose_image_associations.py
```

