Does Express.js respect RFC-3986 for query string?

Issue

Does Express.js follow the RFC 3986 standard when decoding query string parameters?
Why is the literal character "è" accepted while the percent-encoded version "%E8" is not?

Test Express.js HTTP server

'use strict';

const express = require('express');
const bodyParser = require('body-parser');

// Create the Express application
const app = express();

// parse application/x-www-form-urlencoded request bodies
app.use(bodyParser.urlencoded({ extended: false }));

// Log the parsed query string, then end the response so the request does not hang
app.get('/test', (req, res) => {
  console.log(req.query);
  res.sendStatus(200);
});

app.listen(4567, '127.0.0.1', () => {
  console.log('test http server started');
});

Request

GET localhost:4567/test?message=lorem+ipsum%2C%20foo+%E8+bar

Expected log

{ message: 'lorem ipsum, foo è bar' }

Server logs

{ message: 'lorem+ipsum%2C%20foo+%E8+bar' }

If we remove the "%E8" sequence (intended as "è"), the value is decoded as expected:

Request

GET localhost:4567/test?message=lorem+ipsum%2C%20foo+bar

Server logs

{ message: 'lorem ipsum, foo bar' }

According to https://www.url-encode-decode.com/, URIs follow RFC 3986, which does not allow characters such as è, é, à… to appear unencoded.

So it seems that Express refuses those characters, but if we send the literal character instead:

Request

GET localhost:4567/test?message=lorem+ipsum%2C%20foo+è+bar

Expected log

{ message: 'lorem+ipsum%2C%20foo+è+bar' }

Server logs

{ message: 'lorem ipsum, foo è bar' }

So the literal character "è" is accepted, but the encoded version "%E8" is not?

I have tried reading the Express.js source but could not find an answer.

Solution

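Basically self-solved:

First of all, I found that in UTF-8 the hex encoding of "è" is "C3A8", not "E8". The single byte E8 is its encoding in Latin-1 (ISO-8859-1), so the correct percent-encoded form for a UTF-8 query string is "%C3%A8".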
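The difference between the two encodings can be checked directly in Node.js. This is only a small sketch; the variable names are illustrative and not part of the original test server.

```javascript
'use strict';

// "è" is code point U+00E8; written as an escape so the file encoding does not matter.
const eGrave = '\u00E8';

// Bytes of the character in UTF-8 versus Latin-1 (ISO-8859-1).
const utf8Hex = Buffer.from(eGrave, 'utf8').toString('hex');     // 'c3a8'
const latin1Hex = Buffer.from(eGrave, 'latin1').toString('hex'); // 'e8'

// encodeURIComponent always percent-encodes the UTF-8 bytes.
const percentEncoded = encodeURIComponent(eGrave);               // '%C3%A8'

console.log(utf8Hex, latin1Hex, percentEncoded);
```

So a client that sends "%E8" has most likely encoded the character as Latin-1 rather than UTF-8.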

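So Express is probably decoding percent-escapes as UTF-8 and accepting any UTF-8 character, rather than restricting the input to what RFC 3986 allows. That would explain why "%E8" is not decoded while the literal "è" is: the byte E8 on its own is not a valid UTF-8 sequence, so decoding fails and the whole value is left as received, which matches the first server log. Sending "%C3%A8" instead should give the expected result.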
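The behaviour in the logs can be reproduced with a decode-or-fall-back helper. This is a minimal sketch under the assumption that the query parser behaves roughly like this; `safeDecode` is a hypothetical name, not an Express API, and I have not confirmed that Express's parser is implemented exactly this way.

```javascript
'use strict';

// Hypothetical helper: decode one query-string value, treating "+" as a space.
// If the percent-escapes are not valid UTF-8, return the input untouched.
function safeDecode(value) {
  try {
    return decodeURIComponent(value.replace(/\+/g, ' '));
  } catch (err) {
    // decodeURIComponent throws a URIError for malformed sequences such as a lone %E8
    return value;
  }
}

const latin1Escaped = safeDecode('lorem+ipsum%2C%20foo+%E8+bar');
const utf8Escaped = safeDecode('lorem+ipsum%2C%20foo+%C3%A8+bar');

console.log(latin1Escaped); // lorem+ipsum%2C%20foo+%E8+bar
console.log(utf8Escaped);   // lorem ipsum, foo è bar
```

Because the fallback returns the raw input, one invalid escape leaves the entire value undecoded, including the "+" and "%2C" that would otherwise have been decoded.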

Answered By – Andrea Franchini

This answer, collected from Stack Overflow, is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
