Troubleshooting corrupted first 16 bytes in AES CFB mode
I am writing an implementation of snmpwalk with authentication and encryption. Everything works (and I can see the conversations decrypted in Wireshark in-between the client and server), however in the AES decryption, sometimes the first 16 bytes of the decrypted data are corrupted.
like this:
You can see the first 16 bytes do not make sense, however, after that the bytes look like normal.
Attached is the full log. Could you help debug this? Please first verify your logical conclusions by posing a theory and confirming it based on the log file that I have uploaded, only after my confirmation that I agree you can show the fixes.
EditLooking at your log file, I can see the issue with AES decryption where the first 16 bytes are corrupted. Let me analyze this step by step.
The problem appears to be with the AES CFB (Cipher Feedback) mode implementation. Looking at your code, I see you're using AES-128 in CFB mode for privacy, but there's a mismatch in how the IV is being calculated and used between encryption and decryption.
Here are the key observations from your logs:
My theory is: The issue is that you're generating your own salt for encryption ([00, 00, 00, 00, 00, 00, 00, 01]) but the server is responding with a different salt ([11, 57, b1, 11, 81, 13, 88, 71]). When you try to decrypt the server's response using your client-generated IV instead of the server's IV, the first block decrypts incorrectly.
Root cause: In AES CFB mode, the IV must match exactly between encryption and decryption. The server is using a different salt/IV than what your client is calculating for decryption.
Questions to confirm my theory:
Please confirm if this analysis matches what you're seeing, and then I can show you the specific fixes needed.
I have another log, with a successful run - without any code changes! should I upload it, so you can compare?
EditYes, absolutely! Please upload the successful run log. Comparing a successful run with the failed run will be extremely valuable for understanding what's different between the working and non-working cases.
This will help me:
Please upload the successful log - this comparison will likely make the issue much clearer!
Here it goes. The command line to request the data is exactly the same as in the non-working case.
EditPerfect! Now I can see the key difference. Looking at both logs, I've identified the exact issue:
The problem is NOT with your encryption/decryption logic - that's working correctly. The issue is with salt reuse and the privacy counter not being properly incremented.
Key Evidence from the logs:
Root Cause:
The Fix: You need to ensure the privacy counter increments properly across requests within the same session. The issue is in your UsmConfig::next_priv_counter() method being called on a fresh config each time.
Do you confirm this analysis matches what you're seeing? The key insight is that engine time changes between some requests but not others, and your privacy counter resets instead of incrementing properly.
so, it looks like the response when it fails has the engine time incrementing, which causes the IV to be calculated wrongly?
EditExactly! You've identified the core issue perfectly.
What's happening:
During decryption, you use the server's salt but YOUR engine time from discovery, creating a mismatched IV that corrupts the first 16 bytes in CFB mode.
The fix: You need to update your local engine time before each request and use the CURRENT engine time (not the discovery-time engine time) when calculating IVs for decryption.
Do you want me to show you the specific code changes needed?
yeah, please do!
EditHere are the specific code changes needed:
In your snmpwalk.rs
, modify the send_getnext_request
and send_getbulk_request
methods to update the engine time:
Add this method to your SnmpWalker
implementation:
In your snmp.rs
, modify the UsmConfig
to properly track the privacy counter:
Most importantly, in your decrypt_scoped_pdu
method in snmpwalk.rs
, extract and use the server's engine time from the response:
Modify your extract_bindings_from_snmpv3
method to extract and pass the server's engine time:
The key insight is that the server uses its current engine time for encryption, but you were using your cached discovery-time engine time for decryption. This mismatch corrupts the first 16 bytes in AES CFB mode because the IV doesn't match.