A CORS proxy is a service that lets a site access resources from other websites that it does not own, even when the requested resource carries no Access-Control-Allow-Origin header.
There are multiple implementations of CORS proxies, but I was interested in creating a proxy without writing a single line of code.
I came up with two simple solutions.
Requested Resource is Passed as a Part of the Request-URI
The configuration below will handle requests such as https://cors-proxy.example.org/https://google.com/.
server {
    listen 443 ssl;
    server_name cors-proxy.example.org;
    ssl_stapling on;
    ssl_stapling_verify on;
    ssl_certificate /etc/letsencrypt/live/example.org/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/example.org/privkey.pem;
    ssl_trusted_certificate /etc/letsencrypt/live/example.org/chain.pem;
    merge_slashes off;
    resolver 127.0.0.1;
    location ~ ^/(?<pscheme>https?://)(?<phost>[^/]+)(?<ppath>/.*)$ {
        if ($http_origin !~ 'https?://[^.]+\.[^/]+') {
            return 400;
        }
        proxy_http_version 1.1;
        proxy_set_header Host $phost;
        proxy_redirect ~^(https?://)([^/]+)(.*)$ $scheme://$http_host/$1$2$3;
        proxy_pass $pscheme$phost$ppath$is_args$args;
        add_header Access-Control-Allow-Origin * always;
    }
}
The very first lines configure the name of the server (cors-proxy.example.org) and the SSL configuration; you will need to adjust them to match your server name and paths to the SSL key and certificate.
The magic begins with the merge_slashes off line.
If you access https://cors-proxy.example.org/https://google.com/, nginx will by default merge consecutive slashes, and the location it actually uses will in this case be /https:/google.com/. The merge_slashes off directive turns off this behavior.
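To see what the default merging does to the request path, here is a minimal Python sketch that models it. This is only an approximation of nginx's normalization, not its actual code, and the function name merge_slashes is simply borrowed from the directive for illustration.

```python
import re


def merge_slashes(path: str) -> str:
    """Collapse every run of two or more slashes into a single slash."""
    return re.sub(r"/{2,}", "/", path)


# The path part of https://cors-proxy.example.org/https://google.com/
raw_path = "/https://google.com/"
print(merge_slashes(raw_path))  # /https:/google.com/
```

The merged path no longer contains ://, so a location that expects a full URL after the leading slash could not match it.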
Next, we will need a DNS server: because proxy_pass contains variables, nginx resolves the target host name at request time and needs a resolver for that. I have a DNS resolver on 127.0.0.1:53; you may use the one provided by your hosting company or ISP, but in that case do not forget to update the value of the resolver directive (http://nginx.org/en/docs/http/ngx_http_core_module.html#resolver).
The location directive parses the address into the scheme (pscheme), host (phost), and path (ppath) parts, and also makes sure that the requested address begins with either http:// or https://. (Note: I first tried numbered captures instead of named ones, but they did not play nice with the if clause, and I got an “invalid URL prefix” error in nginx’s error log. Presumably, evaluating the regular expression in the if clause overwrites the numbered captures.)
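The way the location pattern splits a request path can be tried out in Python. This is a sketch under the assumption that Python's re module behaves like nginx's PCRE for this pattern; the only change is that Python spells named groups (?P<name>...) where nginx accepts (?<name>...). The helper split_request_uri is hypothetical.

```python
import re

# Same pattern as in the location directive, with Python's named-group syntax.
LOCATION_RE = re.compile(r"^/(?P<pscheme>https?://)(?P<phost>[^/]+)(?P<ppath>/.*)$")


def split_request_uri(uri_path):
    """Return (pscheme, phost, ppath), or None when the pattern does not match."""
    m = LOCATION_RE.match(uri_path)
    if m is None:
        return None
    return m.group("pscheme"), m.group("phost"), m.group("ppath")


print(split_request_uri("/https://google.com/"))  # ('https://', 'google.com', '/')
print(split_request_uri("/https:/google.com/"))   # None (slashes were merged)
print(split_request_uri("/https://google.com"))   # None (no path after the host)
```

The last case shows that the pattern requires at least a / after the host name, so the target URL has to include a path.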
The CORS specification requires every CORS request to include an Origin header (this also prevents the use of the proxy for casual browsing); the if ($http_origin !~ 'https?://[^.]+\.[^/]+') line enforces this by returning 400 Bad Request when the header is missing or does not look like an origin.
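The effect of the Origin check can be modeled with the same regular expression in Python. This is a sketch, assuming that a missing header reaches the check as an empty string and that the ~ operator performs an unanchored search; origin_accepted is a hypothetical helper, and the sample origins are made up.

```python
import re

ORIGIN_RE = re.compile(r"https?://[^.]+\.[^/]+")


def origin_accepted(origin: str) -> bool:
    """True when the request would be proxied, False when it would get 400."""
    return ORIGIN_RE.search(origin) is not None


for origin in ("https://app.example.com", "", "http://localhost:3000"):
    verdict = "proxied" if origin_accepted(origin) else "400"
    print(repr(origin), "->", verdict)
```

Note that the pattern expects a dot in the host name, so an origin such as http://localhost:3000 is rejected as well; this matters when testing from a local development server.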
The proxy_redirect directive is another piece of magic: it makes sure that all redirects are served via our CORS proxy by rewriting scheme://host/request into https://cors-proxy.example.org/scheme://host/request.
Finally, add_header Access-Control-Allow-Origin * always adds the Access-Control-Allow-Origin: * header to all responses, including error responses.
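The rewrite of the Location header can be sketched in Python as follows. This only models the regular expression and the replacement string; rewrite_location is a hypothetical helper, and its scheme and http_host parameters stand in for nginx's $scheme and $http_host variables.

```python
import re

REDIRECT_RE = re.compile(r"^(https?://)([^/]+)(.*)$")


def rewrite_location(location, scheme="https", http_host="cors-proxy.example.org"):
    """Prefix an absolute redirect target with the proxy's own address."""
    m = REDIRECT_RE.match(location)
    if m is None:
        # Relative targets do not match the pattern; this sketch returns them as is.
        return location
    return f"{scheme}://{http_host}/{m.group(1)}{m.group(2)}{m.group(3)}"


print(rewrite_location("https://www.google.com/"))
# https://cors-proxy.example.org/https://www.google.com/
```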
Requested Resource is Passed in the Query String
The configuration below will handle requests such as https://cors-proxy.example.org/?https://google.com/.
server {
    listen 443 ssl;
    server_name cors-proxy.example.org;
    ssl_stapling on;
    ssl_stapling_verify on;
    ssl_certificate /etc/letsencrypt/live/example.org/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/example.org/privkey.pem;
    ssl_trusted_certificate /etc/letsencrypt/live/example.org/chain.pem;
    resolver 127.0.0.1;
    location = / {
        if ($http_origin !~ 'https?://[^.]+\.[^/]+') {
            return 400;
        }
        set $phost "";
        if ($args ~ '^https?://([^/]+)') {
            set $phost $1;
        }
        if ($phost = "") {
            return 400;
        }
        proxy_http_version 1.1;
        proxy_set_header Host $phost;
        proxy_redirect ~^(https?://)([^/]+)(.*)$ $scheme://$http_host/?$1$2$3;
        proxy_pass $args;
        add_header Access-Control-Allow-Origin * always;
    }
}
This configuration looks pretty much the same as the previous one. The key differences are:
- We need to extract the host to connect to from the query string. This is performed in if ($args ~ '^https?://([^/]+)'). If we failed to find a host name, we issue a 400 Bad Request response (if ($phost = "")).
- We pass $args as the argument to proxy_pass, because the entire query string is the URL of the requested resource. We also validated that it begins with http:// or https://.
- proxy_redirect is updated to include ?.
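The differences listed above can be modeled in a few lines of Python. As before, this is a sketch of the regular expressions only, not of nginx itself; plan_upstream and rewrite_location are hypothetical helpers, and args stands for the raw query string (everything after the first ?).

```python
import re

HOST_RE = re.compile(r"^https?://([^/]+)")
REDIRECT_RE = re.compile(r"^(https?://)([^/]+)(.*)$")


def plan_upstream(args: str):
    """Return (phost, proxy_pass target), or None where the config answers 400."""
    m = HOST_RE.match(args)
    phost = m.group(1) if m else ""
    if phost == "":
        return None
    # The whole query string is the URL of the requested resource.
    return phost, args


def rewrite_location(location, scheme="https", http_host="cors-proxy.example.org"):
    """Rewrite an absolute redirect target so that it goes through /?<url>."""
    m = REDIRECT_RE.match(location)
    if m is None:
        return location
    return f"{scheme}://{http_host}/?{m.group(1)}{m.group(2)}{m.group(3)}"


print(plan_upstream("https://google.com/"))   # ('google.com', 'https://google.com/')
print(plan_upstream("ftp://example.org/"))    # None
print(rewrite_location("https://www.google.com/"))
# https://cors-proxy.example.org/?https://www.google.com/
```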
Of course, these configurations have a lot of room for improvement, but they can still serve as a starting point for a curious reader.