
How to skip invalid characters from a UTF-8 XML file or string in PHP

Yesterday I wrote something about stripping out P7M data from an XML P7M file or string, as long as it was encoded using the CAdES format. It was quite ugly, yet it did the job.

Today I will raise the ugly-but-working bar even further by publishing the method I wrote as a follow-up, which basically strips/skips all the invalid characters from the resulting XML string, so it can be parsed into a PHP SimpleXML object:
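The general idea can be sketched as follows. This is a minimal illustration, not the method from the post: the function name `sanitizeXmlString` is hypothetical, and the sketch assumes the `mbstring` and SimpleXML extensions are enabled. It first drops bytes that are not valid UTF-8, then removes code points that XML 1.0 does not allow (most control characters).

```php
<?php
// Hypothetical helper (an assumption, not the original author's code).
function sanitizeXmlString(string $xml): string
{
    // Step 1: drop byte sequences that are not valid UTF-8.
    // With the substitute character set to 'none', mbstring removes
    // invalid sequences instead of replacing them with '?'.
    $previous = mb_substitute_character();
    mb_substitute_character('none');
    $utf8 = mb_convert_encoding($xml, 'UTF-8', 'UTF-8');
    mb_substitute_character($previous);

    // Step 2: remove code points outside the XML 1.0 "Char" range
    // (tab, LF, CR and everything from U+0020 up, minus surrogates,
    // U+FFFE and U+FFFF).
    $pattern = '/[^\x{9}\x{A}\x{D}\x{20}-\x{D7FF}\x{E000}-\x{FFFD}\x{10000}-\x{10FFFF}]/u';
    $clean = preg_replace($pattern, '', $utf8);

    // preg_replace() returns null on failure; fall back to the step 1 result.
    return $clean ?? $utf8;
}
```

A typical call would be `simplexml_load_string(sanitizeXmlString($raw))`, where `$raw` is the XML string obtained from the P7M file. Note that removing characters changes the data, so this only suits cases where the dropped bytes are known to be noise.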



PHP – How to strip P7M data from an XML.P7M file or string (CAdES, FatturaPA)

Some days ago I had to write some PHP code to extract the contents from some XML files (Italian electronic invoices for Public Administrations, also known in Italy as FatturaPA). The work was pretty simple, yet I had to quickly solve two main problems: extracting XML data from a digitally signed .xml.p7m file and stripping away some invalid UTF-8 characters in the XML content itself.

Since I had to get the job done quickly, I’ve dealt with both tasks using quick’n’dirty workarounds by fully taking advantage of the famous PHP “double clawed hammer” features: we’ll be dealing with the first one here, while the latter has been addressed in another dedicated post.

Credits to Ian Baker for this awesome handmade job. Look at his project on Flickr here:

Regarding the P7M thing, I was lucky: all the invoices were digitally signed using the CAdES format, which – as you might already know – works by adding a PKCS#7 header and a signature info footer to the original file, meaning that we can easily get rid of them – as long as we don’t need to check the signature. It’s worth noting that, while this was perfectly fine for my specific scenario – since everything had already been verified – it might not be the case in most situations, where you do want to check the signature before reading/using the file.
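A rough sketch of that header/footer stripping is shown below. It is not the code from the post: the function name `extractXmlFromP7m` is hypothetical, and it assumes the payload starts with an XML declaration and ends with the closing tag of its root element. It does not verify the signature, and if the signer split the payload into several chunks, stray bytes may still remain inside the XML.

```php
<?php
// Hypothetical helper (an assumption, not the original author's code).
// Returns the XML found between the PKCS#7 header and the signature
// footer, or null when no XML payload can be located.
function extractXmlFromP7m(string $p7m): ?string
{
    // The payload is assumed to begin with an XML declaration.
    $start = strpos($p7m, '<?xml');
    if ($start === false) {
        return null;
    }

    // Find the root element name: the first real tag after the
    // declaration ("<?" and "<!" are skipped because the name must
    // start with a letter or underscore). A namespace prefix is allowed.
    $tagPattern = '/<([A-Za-z_][\w.\-]*(?::[\w.\-]+)?)[\s>\/]/';
    if (!preg_match($tagPattern, $p7m, $m, 0, $start)) {
        return null;
    }

    // The payload is assumed to end with the last matching closing tag.
    $closing = '</' . $m[1] . '>';
    $end = strrpos($p7m, $closing);
    if ($end === false || $end < $start) {
        return null;
    }

    return substr($p7m, $start, $end + strlen($closing) - $start);
}
```

With a file on disk, this could be called as `extractXmlFromP7m(file_get_contents($path))`, where `$path` is a placeholder for the .xml.p7m file location.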



How to URLEncode in UTF8 with Visual Basic 6

Visual Basic 6 (VB6) is like a mythical creature: no matter how hard you try to get rid of it, it always comes back to haunt you sooner or later. Most of the time it takes the form of some old and outdated home-made software you really would like to dismiss, yet you still have it somewhere. Today we had another resurrection in our office involving a very old client-based software that we still use here and there.

The issue we had to face was related to the following URLEncode function hidden within the source code:



Get a File Content-Type / MIME-type from file extension in ASP.NET C#

How can I get the MIME type from a file extension in C#? This is a rather common question among developers, an evergreen requirement that I happen to hear at least once a year from friends & colleagues working with ASP.NET MVC, ASP.NET Web API and (lately) .NET Core. The reason is pretty much obvious: whenever you end up working with file object storage in any web-based or client-based application, you will sooner or later have to retrieve the MIME type related to the byte array you’re dealing with.

There are a number of ways/techniques to do that, but – for the sake of simplicity – we will narrow them down to two: looking them up within the Windows Registry or relying on static, hard-coded MIME type lists. We won’t consider anything that involves querying an external service, as we do want an efficient way to deal with such an issue.



ASP.NET – CSS Media Queries in Razor Pages – How to embed @media syntax

If you’re working with ASP.NET MVC or ASP.NET Core using Razor pages and you want to put a CSS style within the page (CSS embed), you might stumble upon one of the following errors:

CS0103: The name ‘media’ does not exist in the current context.

CS0103: The name ‘if’ does not exist in the current context.

… And so on.

When something like that occurs, it probably means that you’re using a CSS3 media query (or other CSS3 query-related commands) such as this: