How to remove ‘active contents’ from PDF file on Linux / Strip Active Contents from PDF

how-to-remove-active-content-from-pdf-with-ghoscript-on-gnu-linux.svg

I'm updating my Autiobography (CV) with my latest job eployeers, technology expertise and certifications and usually use the EuroPassCV standard web service to update already generated PDF files.The service as web based application service allows easy edit from the web as most web services which is quite handy and then allows Export to DOCX or PDF file format. So far so good but today I faced a really weird problem after, I've used successfully EuroPassCV service  and downloaded the PDF to my computer and tried to submit my Curriculum Vitae application to SAP's Successfactor newly created account for the purpose I faced a weird I error saying

"The system does not allow files with Active contents. Please …"

the-system-does-not-allow-files-with-active-contents-pdf-error-successfactors-errors

Of course if this error message was received on a Start-up application on Application upload that would be fine, but come on this is SAP's Successfactors, it cannot accept a standard generated PDF from EuroPass which nowadays is a standard for CV here in Europe and hosted on of official European Union website europa.eu

To me this is a clear signal SAP needs an experienced ICT specialists and Quality Assurance testers like me to fix their mess and I will be willing to help them if they contact me until its too late for them, but let me go back to the topic of this article which was how to remove active contents from a PDF file 🙂

So first lets make clear what is Active content in a file ?

Active contents is content that includes programs like Internet polls, JavaScript applications, stock tickers, animated images, ActiveX applications, action items, streaming video and audio, weather maps, embedded objects, and much more. Active content contains programs that trigger automatic actions on a Web page without the user's knowledge or consent.
Active contents (Macros) could exist in many file formats that are used daily in most companies / organizations daily, active content can be contained in documents such as MS Excel,  Word, PDF, PowerPoint and so on.

So why does some applications disable document support for Active contents?

Well just for the reason of security, Active contents could often be some kind of malware or crapware and they can mess up with the web application (in case of bugs) or even mess up with server software if it is a complex warm like behavior exploiting some kind of vulnerability.
One thing to say about active contents removal on file upload by applications is that this practice could only be tolerated if the organization had already adapted a security through obscurity which most likely is the case with SAP's Successfactors and many other applications out there.

So next question is how to  Panicea (Resolution) Active Contents existing in a PDF file

Assuming you have a GNU / Linux Desktop or server with ghostscript package installed (which is the case by default with virtually any modern Linux distribution), removing Active Contents from PDF to make possible file to be submitted to the picky Security Conscious application with a single command:
 

gs -dNOPAUSE -sDEVICE=pdfwrite -sOUTPUTFILE=CV-Georgi_Dimitrov_Georgiev-Europass-20190718-EN-noact-content.pdf -dBATCH CV-Georgi_Dimitrov_Georgiev-Europass-20190718-EN.pdf


After that the stripped active contents PDF file would succeed in uploading to web app.
 

 

 

Share this on:

More helpful Articles

Download PDFDownload PDF

Leave a Reply

CommentLuv badge