Update proposal 166

eyedeekay
2024-08-27 22:13:59 -04:00
parent 373fcc20e4
commit fe89405ca4

@ -5,9 +5,9 @@ I2P proposal #166: Identity/Host Aware Tunnel Types
:author: eyedeekay
:created: 2024-05-27
:thread: http://i2pforum.i2p/viewforum.php?f=13
:lastupdated: 2024-05-27
:lastupdated: 2024-08-27
:status: Open
:target: 0.9.62
:target: 0.9.65
.. contents::
@ -17,8 +17,9 @@ Proposal for a Host-Aware HTTP Proxy Tunnel Type
This is a proposal to resolve the “Shared Identity Problem” in
conventional HTTP-over-I2P usage by introducing a new HTTP proxy tunnel
type. This tunnel type has supplemental behavior which is intended to
prevent or limit the utility of tracking conducted by server operators,
against user-agents(browsers) and the I2P Client Application itself.
prevent or limit the utility of tracking conducted by potentially hostile
hidden service operators against targeted user-agents(browsers) and the
I2P Client Application itself.
What is the “Shared Identity” problem?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@ -26,8 +27,9 @@ What is the “Shared Identity” problem?
The “Shared Identity” problem occurs when a user-agent on a
cryptographically addressed overlay network shares a cryptographic
identity with another user-agent. This occurs, for instance, when
Firefox and GNU Wget are both configured to use the same HTTP Proxy. In
this scenario, it is possible for the server to collect and store the
Firefox and GNU Wget are both configured to use the same HTTP Proxy.
In this scenario, it is possible for the server to collect and store the
cryptographic address(Destination) used to reply to the activity. It can
treat this as a “Fingerprint” which is always 100% unique, because it is
cryptographic in origin. This means that the linkability observed by the
@ -44,15 +46,37 @@ with the deleted comments accessible courtesy of
`pullpush.io <https://api.pullpush.io/reddit/search/comment/?link_id=579idi>`__.
*At the time* I was one of the most active respondents, and *at the
time* I believed the issue was small. In the past 8 years, the situation
and my opinion of it have changed, with the emergence of Mastodon and
Matrix servers inside of I2P, the threat posed by malicious destination
correlation grows considerably as these sites are in a position to
“profile” specific users. `An example implementation of the Shared
Identity attack on HTTP
User-Agents <https://github.com/eyedeekay/colluding_sites_attack/>`__
and my opinion of it have changed. I now believe the threat posed by
malicious destination correlation grows considerably as more sites are
in a position to “profile” specific users.
The Shared Identity is not useful against a user who is using I2P to
obfuscate geolocation. It also cannot be used to break I2P's routing.
This attack has a very low barrier to entry. It only requires that a
hidden service operator operate multiple services. For attacks on
contemporary visits(visiting multiple sites at the same time), this is
the only requirement. For non-contemporary linking, one of those
services must be a service which hosts “accounts” which belong to a
single user who is targeted for tracking.
Currently, any service operator who hosts user accounts will be able to
correlate them with activity across any sites they control by exploiting
the Shared Identity problem. Mastodon, Gitlab, or even simple forums
could be attackers in disguise as long as they operate more than one
service and have an interest in creating a profile for a user. This
surveillance could be conducted for stalking, financial gain, or
intelligence-related reasons. Right now there are dozens of major
operators who could carry out this attack and gain meaningful data from
it. We mostly trust them not to for now, but players who don't care
about our opinions could easily emerge.
This is directly related to a fairly basic form of profile-building on
the clear web where organizations can correlate interactions on their
site with interactions on networks they control. On I2P, because the
cryptographic destination is unique, this technique can sometimes be
even more reliable, albeit without the additional power of geolocation.
The Shared Identity is not useful against a user who is using I2P solely
to obfuscate geolocation. It also cannot be used to break I2P's routing.
It is only a problem of contextual identity management.
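
The correlation step described above can be illustrated with a minimal
Python sketch. It is not taken from either example implementation; it
assumes a single operator who keeps a request log per site, and the site
names, log format, and the ``correlate`` helper are all hypothetical.

```python
from collections import defaultdict

def correlate(*site_logs):
    """Group observations from several colluding sites by client Destination.

    Each log is a (site_name, [(destination, observation), ...]) pair.
    Returns only the Destinations that were seen on more than one site.
    """
    seen = defaultdict(dict)
    for site_name, entries in site_logs:
        for destination, observation in entries:
            seen[destination].setdefault(site_name, []).append(observation)
    return {dest: sites for dest, sites in seen.items() if len(sites) > 1}

# Hypothetical logs; "AAAA" and "BBBB" stand in for clients' base64 Destinations.
forum_log = ("forum.example.i2p", [("AAAA", "login as alice"), ("BBBB", "anonymous read")])
shop_log = ("shop.example.i2p", [("AAAA", "viewed item 17")])

linked = correlate(forum_log, shop_log)
```

Here ``linked`` contains only ``AAAA``, tying the forum account to the
shop visit. If each host saw a different Destination for the same user,
the join would come back empty.
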
- It is impossible to use the Shared Identity problem to geolocate an
I2P user.
@ -69,6 +93,15 @@ which supports “Tabbed” operation.
third-party resources.
- Disabling Javascript accomplishes **nothing** against the Shared
Identity problem.
- If a link can be established between non-contemporary sessions such
as by “traditional” browser fingerprinting, then the Shared Identity
can be applied transitively, potentially enabling a non-contemporary
linking strategy.
- If a link can be established between a clearnet activity and an I2P
identity, for instance, if the target is logged into a site with both
an I2P and a clearnet presence on both sides, the Shared Identity can
be applied transitively, potentially enabling complete
de-anonymization.
How you view the severity of the Shared Identity problem as it applies
to the I2P HTTP proxy depends on where you(or more to the point, a
@ -81,30 +114,10 @@ identity” for the application lies. There are several possibilities:
how it works when an application uses an API like SAMv3 or I2CP,
where an application creates its identity and controls its
lifetime.
3. HTTP is the Application, but the Contextual Identity is controlled
with the “Authentication Hack” - Interesting possibility detailed at
the end of this proposal, not the object of this proposal
4. HTTP is the Application, but the Host is the Contextual Identity
3. HTTP is the Application, but the Host is the Contextual Identity
-This is the object of this proposal, which treats each Host as a
potential “Web Application” and treats the threat surface as such.
It also depends on who you think your attackers are and what you would
like to prevent. Someone in a position to carry out this attack would be
a person in a position to have multiple sites “collude” in order to
collect the destinations of I2P Clients, in order to correlate activity
on one site with activity on another. This is a fairly basic form of
profile-building on the clear web where organizations can correlate
interactions on their site with interactions on networks they control. On
I2P, because the cryptographic destination is unique, this technique can
sometimes be even more reliable, albeit without the additional power of
geolocation. Any service which hosts user accounts would be able to
correlate them with activity across any sites they control using the
Shared Identity problem. Mastodon, Gitlab, or even simple Forums could
be attackers in disguise as long as they operate more than one service
and have an interest in creating a profile for a user. This surveillance
could be conducted for stalking, financial gain, or intelligence-related
reasons.
Is it Solvable?
^^^^^^^^^^^^^^^
@ -114,9 +127,10 @@ anonymity of an application. However, it is possible to build a proxy
which intelligently responds to a specific application which behaves in
a predictable way. For instance, in modern Web Browsers, it is expected
that users will have multiple tabs open, where they will be interacting
with multiple web sites, which will be distinguished by hostname. This
allows us to improve upon the behavior of the HTTP Proxy for this type
of HTTP user-agent by making the behavior of the proxy match the
with multiple web sites, which will be distinguished by hostname.
This allows us to improve upon the behavior of the HTTP Proxy for this
type of HTTP user-agent by making the behavior of the proxy match the
behavior of the user-agent by giving each host its own Destination when
used with the HTTP Proxy. This change makes it impossible to use the
Shared Identity problem to derive a fingerprint which can be used to
@ -128,20 +142,21 @@ Description:
A new HTTP Proxy will be created and added to Hidden Services
Manager(I2PTunnel). The new HTTP Proxy will operate as a “multiplexer”
of HTTP Proxies. The multiplexer itself has no destination. Each
individual HTTP Proxy which becomes part of the multiplex has its own
local destination, random local port, and its own tunnel pool. HTTP
proxies are created on-demand by the multiplexer, where the “demand” is
of I2P Sockets. The multiplexer itself has no destination. Each
individual I2P Socket which becomes part of the multiplex has its own
local destination, random local port, and its own tunnel pool. I2P
Sockets are created on-demand by the multiplexer, where the “demand” is
the first visit to the new host. It is possible to optimize the creation
of the HTTP proxies before inserting them into the multiplexer by
creating one or more in advance and storing them outside the multiplexer
of the I2P Sockets before inserting them into the multiplexer by
creating one or more in advance and storing them outside the
multiplexer. This may improve performance.
An additional HTTP proxy, with its own destination, is set up as the
An additional I2P Socket, with its own destination, is set up as the
carrier of an “Outproxy” for any site which does *not* have an I2P
Destination, for example any Clearnet site. This effectively makes all
Outproxy usage a single Contextual Identity, with the caveat that
configuring multiple Outproxies for the tunnel will cause the normal
"Sticky" outproxy rotation, where each outproxy only gets requests for a
Sticky outproxy rotation, where each outproxy only gets requests for a
single site. This is *almost* equivalent to isolating
HTTP-over-I2P proxies by destination on the clear internet.
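
The multiplexing behavior described above can be sketched roughly as
follows. This is a toy Python model under stated assumptions, not the
Java implementation: ``FakeSocketManager`` is a hypothetical stand-in
for an I2P Socket with its own destination and tunnel pool, hosts are
classified as I2P hosts purely by an ``.i2p`` suffix, and the "Sticky"
rotation across multiple outproxies is not modeled.

```python
import itertools

class FakeSocketManager:
    """Hypothetical stand-in for an I2P Socket with its own Destination."""
    _ids = itertools.count(1)

    def __init__(self):
        self.destination = f"destination-{next(self._ids)}"

class HostAwareMultiplexer:
    """Maps each I2P host to its own socket; has no Destination itself."""

    def __init__(self, prebuilt=1):
        self._by_host = {}
        # Sockets created in advance and stored outside the multiplex.
        self._spares = [FakeSocketManager() for _ in range(prebuilt)]
        # One shared contextual identity for every outproxied (non-I2P) host.
        self._outproxy = FakeSocketManager()

    def manager_for(self, host):
        host = host.lower().rstrip(".")
        if not host.endswith(".i2p"):
            return self._outproxy
        if host not in self._by_host:
            # First visit to this host: use a spare if one exists, else build one.
            self._by_host[host] = self._spares.pop() if self._spares else FakeSocketManager()
        return self._by_host[host]

mux = HostAwareMultiplexer()
idk = mux.manager_for("idk.i2p")
git = mux.manager_for("git.idk.i2p")
```

Repeat visits to ``idk.i2p`` reuse the same socket, ``git.idk.i2p`` gets
a different one, and every clearnet host shares the single outproxy
socket.
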
@ -151,9 +166,8 @@ Resource Considerations:
The new HTTP proxy requires additional resources compared to the
existing HTTP proxy. It will:
- Potentially build more tunnels
- Potentially build more tunnels and I2PSockets
- Build tunnels more often
- Occupy more ports
Each of these requires:
@ -168,11 +182,11 @@ proxy should be configured to use as little as possible. Proxies which
are part of the multiplexer(not the parent proxy) should be configured
to:
- Multiplexed I2PTunnels build 1 tunnel in, 1 tunnel out in their
- Multiplexed I2PSockets build 1 tunnel in, 1 tunnel out in their
tunnel pools
- Multiplexed I2PTunnels take 3 hops by default.
- Close tunnels after 10 minutes of inactivity
- I2PTunnels started by the Multiplexer share the lifespan of the
- Multiplexed I2PSockets take 3 hops by default.
- Close sockets after 10 minutes of inactivity
- I2PSockets started by the Multiplexer share the lifespan of the
Multiplexer. Multiplexed tunnels are not “Destructed” until the
parent Multiplexer is.
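
As a rough illustration, the settings listed above could be expressed as
a set of tunnel options. The Python helper below is a sketch only: the
function name is hypothetical, and the option keys are assumed to match
the I2CP and I2PTunnel option names (``inbound.quantity``,
``i2cp.closeIdleTime``, and so on), so they should be checked against the
I2CP documentation before use.

```python
def multiplexed_tunnel_options(hops=3, idle_minutes=10):
    """Build option strings for one multiplexed socket (assumed key names)."""
    return {
        # One tunnel in, one tunnel out per tunnel pool.
        "inbound.quantity": "1",
        "outbound.quantity": "1",
        # Three hops by default in each direction.
        "inbound.length": str(hops),
        "outbound.length": str(hops),
        # Close after a period of inactivity, expressed in milliseconds.
        "i2cp.closeOnIdle": "true",
        "i2cp.closeIdleTime": str(idle_minutes * 60 * 1000),
    }

opts = multiplexed_tunnel_options()
```

Keeping the values as strings mirrors how such options are usually
passed as properties. The sketch does not model the multiplexed sockets
sharing the lifespan of the parent.
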
@ -185,87 +199,78 @@ section. As you can see, the HTTP proxy interacts with I2P sites
directly using only one destination. In this scenario, HTTP is both the
application and the contextual identity.
.. code::
.. code:: md
**Current Situation: HTTP is the Application, HTTP is the Contextual Identity**
__-> Outproxy <-> i2pgit.org
/
Browser <-> HTTP Proxy(one Destination) <---> idk.i2p
\__-> translate.idk.i2p
\__-> git.idk.i2p
__-> Outproxy <-> i2pgit.org
/
Browser <-> HTTP Proxy(one Destination)<->I2P Socket <---> idk.i2p
\__-> translate.idk.i2p
\__-> git.idk.i2p
The diagram below represents the operation of a host-aware HTTP proxy,
which corresponds to “Possibility 4.” under the “Is it a problem”
which corresponds to “Possibility 3.” under the “Is it a problem”
section. In this scenario, HTTP is the application, but the Host
defines the contextual identity, wherein each I2P site interacts with a
different HTTP proxy with a unique destination per-host. This prevents
operators of multiple sites from being able to distinguish when the same
person is visiting multiple sites which they operate.
.. code::
.. code:: md
**After the Change: HTTP is the Application, Host is the Contextual Identity**
__-> HTTP Proxy(Destination A - Outproxies Only) <--> i2pgit.org
__-> I2P Socket(Destination A - Outproxies Only) <--> i2pgit.org
/
Browser <-> HTTP Proxy Multiplexer(No Destination) <---> HTTP Proxy(Destination B) <--> idk.i2p
\__-> HTTP Proxy(Destination C) <--> translate.idk.i2p
\__-> HTTP Proxy(Destination C) <--> git.idk.i2p
Browser <-> HTTP Proxy Multiplexer(No Destination) <---> I2P Socket(Destination B) <--> idk.i2p
\__-> I2P Socket(Destination C) <--> translate.idk.i2p
\__-> I2P Socket(Destination D) <--> git.idk.i2p
Status:
^^^^^^^
A working Java implementation of the host-aware proxy which conforms to
this proposal is available at idk's fork under the branch:
i2p.i2p.2.6.0-browser-proxy-post-keepalive Link in citations.
an older version of this proposal is available at idk's fork under the
branch ``i2p.i2p.2.6.0-browser-proxy-post-keepalive``; a link is in the
citations. It is under heavy revision in order to break the changes down
into smaller sections.
Implementations with varying capabilities have been written in Go using
the SAMv3 library, they may be useful for embedding in other Go
applications of for go-i2p but are unsuitable for Java I2P.
applications or for go-i2p but are unsuitable for Java I2P.
Additionally, they lack good support for working interactively with
encrypted leaseSets.
Addendum: SOCKS
Addendum: ``i2psocks``
A similar shared identity problem exists in the SOCKS proxy as well.
However, there, it is harder to solve in part due to the reasons
described on the “SOCKS Tips” page on the I2P site. In particular, it
requires much more effort to determine internal destinations and
outgoing hostnames. However, there is a way which works well, and which
has the additional value of being possible to implement as an HTTP proxy
as well. This could allow an HTTP Proxy and a SOCKS proxy to work in
unison, providing clients with the same identity on a per-host basis.
This in turn could allow for efficient, unlinkable WebRTC inside of I2P.
A simple application-oriented approach to isolating other types of
clients is possible without implementing a new tunnel type or changing
the existing I2P code by combining I2PTunnel with existing tools which are
already widely available and tested in the privacy community. However,
this approach makes a difficult assumption which is not true for HTTP
and also not true for many other kinds of potential I2P clients.
The drawback, however, is that it requires some basic cooperation on the
part of the client. In lieu of isolating by-host, the client should send
an “Isolation String” as if it were a part of the username and password
sent to the SOCKS proxy server. For instance, if the SOCKS proxy
required username and password, then the isolation string would be
appended after the password as a third component. The username and
password would be authenticated first, and upon success, the isolation
string would be used to add a SOCKS proxy to the multiplex. If the SOCKS
proxy server required no username and password, *any* string would be a
valid “Isolation String.”
Roughly, the following script will produce an application-aware SOCKS5
proxy and socksify the underlying command:
This could allow for better and more sophisticated isolation in some
circumstances, because the isolation string need not consist of only a
hostname or destination. A wrapper could be created for ``torsocks``,
``i2psocks`` which would pass this isolation string to the SOCKS proxy
it would use. It would be aware of its own arguments, giving it the
ability to generate the isolation string on the fly based on the input.
``i2psocks curl http://idk.i2p`` could produce an authentication string
like ``curlhttpidk`` giving it a destination which exists only for the
time it takes to run the application. ``curl`` is merely an example,
this approach would work for applications with longer lifetimes too.
.. code:: sh
.. code::
#! /bin/sh
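# Start an I2PTunnel SOCKS proxy listening on local port 7695.
java -jar ~/i2p/lib/i2ptunnel.jar -wait -e 'sockstunnel 7695'
# Run the wrapped command through that proxy; "$@" keeps each argument intact.
torsocks --port 7695 "$@"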
**Hypothetical Future: SOCKS is the Application, Contextual Identity is decided by the app or perhaps a wrapper**
__-> SOCKS Proxy(Isolation String firefoxi2pgitorg) <--> i2pgit.org
/
Browser <-> SOCKS Proxy Multiplexer(No Destination, No Isolation String) <---> SOCKS Proxy(Isolation String curlidk) <--> idk.i2p
\__-> SOCKS Proxy(Isolation String firefoxtranslateidk) <--> translate.idk.i2p
\__-> SOCKS Proxy(Isolation String firefoxgitidk) <--> git.idk.i2p
Addendum: ``example implementation of the attack``
`An example implementation of the Shared Identity attack on HTTP
User-Agents <https://github.com/eyedeekay/colluding_sites_attack/>`__
has existed for several years. An additional example is available in the
``simple-colluder`` subdirectory of `idk's prop166
repository <https://git.idk.i2p/idk/i2p.host-aware-proxy>`__. These
examples are deliberately designed to demonstrate that the attack works
and would require modification(albeit minor) to be turned into a real
attack.
Citations:
''''''''''