So I’ve sunk my nose into code for a few days. Time to recap …



The Journey/Trip

My goal was to understand what kind of infrastructure/processes are behind the I2P project and its development. Curious as I often am, I got into a lot of details, so this was mostly not very straightforward …

First step, getting the source

I started by reading the guides the project provides on its web pages. Finding something on those pages isn’t hard; they are full of useful information. The structure/navigation wasn’t very intuitive for me, though. Lots of goodies hidden in the depths there, that kind of impression.

Anyway, the project has two main repositories:

  • i2p.i2p - the Java router and everything that belongs to it
  • i2p.www - the project’s web-site and documentation

Next thing: getting the code (if you want to stay within I2P and anonymous in the process) isn’t necessarily as easy as running a “git clone”. It can succeed, and I’ve done it successfully (it then took about an hour per repository), but on my next attempt I was restarting git again and again for several days without success in the end.

In short: you need an SSH tunnel set up in the router. git clone will restart from scratch if the connection is interrupted, and I2P connections can get interrupted. There are tricks like using --depth 1 to fetch only a small part and then getting the rest in a further step with git fetch --unshallow.

When I tried that to help the situation with the i2p.www repository, it didn’t seem to be of much help. Cloning with --depth worked fine, but the --unshallow invocation was interrupted again and again, and from what I could see it didn’t resume but started from (its) scratch every time.
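One variation I have not tried over I2P is to deepen the shallow clone in small steps with git fetch --deepen instead of a single --unshallow, so that an interruption only costs the current step. A minimal sketch, assuming a throwaway local repository in place of the real remote (all paths and the step size of 5 are made up for illustration):

```shell
#!/bin/sh
# Sketch only: a local throwaway repository stands in for the real remote.
set -eu
work=$(mktemp -d)
git init -q "$work/upstream"
i=1
while [ "$i" -le 20 ]; do
    git -C "$work/upstream" -c user.name=demo -c user.email=demo@example.invalid \
        commit -q --allow-empty -m "commit $i"
    i=$((i + 1))
done

# Shallow clone; the file:// URL makes git use its normal transport,
# so --depth is honoured even though the source is local.
git clone -q --depth 1 "file://$work/upstream" "$work/clone"

# Deepen by 5 commits per step. A failed step is simply tried again on the
# next round; the loop is bounded so it cannot spin forever.
steps=0
while [ "$(git -C "$work/clone" rev-parse --is-shallow-repository)" = "true" ] \
        && [ "$steps" -lt 10 ]; do
    git -C "$work/clone" fetch -q --deepen=5 || sleep 1
    steps=$((steps + 1))
done

total=$(git -C "$work/clone" rev-list --count HEAD)
echo "commits available after deepening: $total"
```

Whether this actually behaves better over a flaky I2P tunnel is something I would still have to test; the step size is a trade-off between the number of round trips and the work lost per interruption.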

So that gave me plenty of time to read about the alternative way using git bundles …

git-bundles

A git bundle is a one-file archive containing a git repository. Sounds awesome, and seems to be awesome. Bundles are rather easy to create but, depending on the repository’s structure, can be tricky to use. The use case here is to provide such a bundle so that it can be downloaded as quickly and reliably as possible, letting users get started with a repository “without hassle”.

Now creation seems easy, and is also not a problem in the sense that it can be done behind the scenes or automated. Distributing the file … well, distributing files has been attempted, so there is infrastructure in terms of torrents that is ready to use. And with the proper options, using the file is a one-liner at first (after that comes connecting to upstream and verifying/updating from it).
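To make the create/clone/connect steps concrete, here is a minimal sketch using a throwaway local repository. The file name demo.bundle and all directory names are made up, and the local source repository plays the role of “upstream”; with a real bundle you would set the URL of the project’s actual repository instead.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)

# Throwaway repository standing in for the project repository.
git init -q "$work/src"
git -C "$work/src" -c user.name=demo -c user.email=demo@example.invalid \
    commit -q --allow-empty -m "initial commit"
branch=$(git -C "$work/src" symbolic-ref --short HEAD)

# Publisher side: pack all refs and their history into a single file.
git -C "$work/src" bundle create "$work/demo.bundle" --all

# User side: clone from the file as if it were a remote ...
git clone -q -b "$branch" "$work/demo.bundle" "$work/checkout"
# ... and let git check that the bundle is valid for this repository.
git -C "$work/checkout" bundle verify "$work/demo.bundle"

# Point origin at upstream so later fetches bring in newer commits.
git -C "$work/checkout" remote set-url origin "$work/src"
git -C "$work/checkout" fetch -q origin
```

After the last step the checkout behaves like a normal clone; the bundle file is no longer needed.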

So sound idea, I wanted to try that out just for the sake of having seen it actually. It was quite a journey to figure things out then in the end. As so often. But yes, the process can be made as simple as I just described it. It wasn’t (necessarily) quite like that at that point, though.

For i2p.i2p there was a three-year-old torrent being seeded (not well seeded though, it also took an hour to download). I downloaded that, and once I had figured out the needed parameters it worked fine. While struggling so hard with the i2p.www download I couldn’t find any bundle for that, though. So I also revisited the creation of those bundles and started wondering about the creation/distribution …

Mirror, mirror …

As the creation is rather straightforward I started thinking about automation (see project bundle-bot). Shortly afterwards idk came up with updated bundles for i2p.i2p, so helping to seed those rather than making my own seems sensible. While there is no i2p.www bundle yet (to my knowledge), I will keep seeding the version I created there. But generally I switched focus to “how to help seeding those” to make the download experience better for those who are going to use them.

While bundles could easily be generated on an hourly schedule or whatever, seeding (distributing to seeders until those have them ready to help with downloads) rather suggests longer intervals, so that more seeders are available for a given version instead of being stuck in an endless loop of trying to get the newest one.

idk pointed out that tags (as in git-tag) for the versions mark certain points in a stable way. It might also be possible to generate identical bundles when using the exact same tag? This would allow seeders to generate them on their own instead of having to download them, maybe. Maybe not: in what I tried, they had different checksums, no matter how I tried to make them “the same”.

That felt a bit strange at first, but git must deal a lot with “meta data”, and there probably is plenty of room for timestamps or the like to mess up “binary equality”. git just cares about the files in that sense, not about the binary checksum? That’s something I might look into a bit more at some point.
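Independent of the checksum question, two bundles can be compared by what they contain: git bundle list-heads prints the refs and commit IDs a bundle carries, and since a commit ID is a hash over the commit and its history, equal IDs mean equal content. A small sketch, assuming a throwaway repository and a made-up tag name demo-1.0:

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
git init -q "$work/repo"
git -C "$work/repo" -c user.name=demo -c user.email=demo@example.invalid \
    commit -q --allow-empty -m "release commit"
git -C "$work/repo" tag demo-1.0     # hypothetical tag name

# Two bundles created independently from the same tag.
git -C "$work/repo" bundle create "$work/a.bundle" demo-1.0
git -C "$work/repo" bundle create "$work/b.bundle" demo-1.0

# Byte-level comparison: the checksums may or may not agree.
cksum "$work/a.bundle" "$work/b.bundle"

# Content-level comparison: refs and commit IDs listed in each bundle header.
heads_a=$(git -C "$work/repo" bundle list-heads "$work/a.bundle")
heads_b=$(git -C "$work/repo" bundle list-heads "$work/b.bundle")
if [ "$heads_a" = "$heads_b" ]; then
    echo "both bundles carry the same refs and commit IDs"
fi
```

This does not help with torrents, which need byte-identical files, but it is a quick way to confirm that a self-made bundle and a downloaded one describe the same state.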

Ok, so tagged versions of i2p.i2p make good candidates to get from idk’s website and dump the .torrent into i2psnark from there. To automate that I’m fiddling with a script in the aforementioned repository.

Building the projects and looking into the code …

Aside from the detour described, I also started looking into the code bases of the repositories. Even without compiling/running anything, this turned out to be interesting in many more ways than I had expected.

“building” i2p.www

The I2P project has basically developed a content management system including a web server that manages translations on the fly. I have been looking into Hugo lately, a static web-site generator, and the “python scripts” of I2P (that seems to be the call sign) do pretty much the same, aside from generating a static version of the web-site. Which I actually don’t understand: serving only static content seems a rather secure and very performant way to provide web content. But the point I wanted to get at is, these scripts must have been running for years in an important role and almost seem to be worth a spin-off project. But for the I2P project it is just “the python scripts”? I couldn’t even find any author information (the repository does have a license file, though).

So … working on the web-site/documentation actually requires a build environment, like any other software project. How does that go? Well, there is the probably easy way of “just firing up a provided Docker container”, and there is the other way, which seems complicated as so often. I didn’t really find any documentation on this and will need to revisit it. But I actually just wanted to help update some documentation and looked into doing that while postponing the “building” aspect.

The Dockerfile contained in the i2p.www repository seems to include all commands and dependencies as required on a Debian system. So let’s at least give that a shot:

root@test:~# apt install docker.io
[...]
root@test:~# git clone -b master --single-branch i2p.www.bundle-complete-git-2023-01-13 i2p.www
[...]
root@test:~/i2p.www# vim config.sh
i2p_www_docker_run_args='-d'
root@test:~/i2p.www# ./site-updater-docker.sh 
[...]
compiling catalog 'i2p2www/translations/zh/LC_MESSAGES/research.po' to 'i2p2www/translations/zh/LC_MESSAGES/research.mo'
compiling catalog 'i2p2www/translations/zh_TW/LC_MESSAGES/research.po' to 'i2p2www/translations/zh_TW/LC_MESSAGES/research.mo'
fatal: not a git repository (or any of the parent directories): .git
Git revision: 
[...]
Successfully built 29a6c415d985
Successfully tagged mirror.i2p.www:latest
Error: No such container: mirror.i2p.www
1eca480ab9d09bd976250b466136ff7f1a754f43b4d3eeb3cad583111422d596
root@test:~/i2p.www# docker ps
CONTAINER ID   IMAGE            COMMAND                  CREATED         STATUS         PORTS                  NAMES
1eca480ab9d0   mirror.i2p.www   "/bin/sh -c 'service…"   5 minutes ago   Up 5 minutes   0.0.0.0:8090->80/tcp   mirror.i2p.www

Aside from the error messages, this brings up the site on port 8090 nonetheless. While it seems fine at first glance, errors probably mean that something isn’t working as it should. Will need to investigate what that might be … It’s the first time I’ve used Docker, though, so it might be something rather obvious/minor.

Note that I “trust” the bundled repository here and did not connect it to “upstream” for this test, since I built the bundle myself as described/linked above. If you use a bundle from somewhere else, you should connect it to upstream and verify it!

The git error is from Dockerfile#L19. So it’s just “writing” the version here, basically, but I think the impact might be bigger as the python scripts might utilise that in some way …
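I haven’t checked how the Dockerfile handles this, but a build step that records the revision can be made robust against a missing .git directory by falling back to a placeholder. A minimal sketch, assuming a hypothetical helper git_revision (not taken from the repository):

```shell
#!/bin/sh
set -eu

# Hypothetical helper, not part of i2p.www: print the short commit ID of the
# directory given as $1, or "unknown" if that directory is not a git repository.
git_revision() {
    git -C "$1" rev-parse --short HEAD 2>/dev/null || echo "unknown"
}

plain_dir=$(mktemp -d)
# Keep git from searching parent directories of the temporary directory.
GIT_CEILING_DIRECTORIES=$(dirname "$plain_dir")
export GIT_CEILING_DIRECTORIES

revision=$(git_revision "$plain_dir")
echo "Git revision: $revision"
```

With something like this the log would show a clear placeholder instead of an error followed by an empty value, which would at least make it obvious that the revision is missing on purpose.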

Working on i2p.www without building

This all also means: the pages themselves are … “code-stylish” let’s say, meaning you cannot edit much content without dealing with tags and code and the logic behind it. I had started to make small corrections and quickly hit walls:

  • Translations are done via gettext, and for that all content is broken down into translation tags (think paragraphs). Any change will probably raise “alarms” for the translators, but I haven’t looked into that process yet. It seems crucial to understand and makes “just refactoring” … well, not a no-go I guess, but something that requires weighing the improvement against the effort/impact on the translators? I refrained from many “hey, wouldn’t this be better/nicer like this …” improvements and stuck to “this is factually wrong and needs to be corrected sooner or later” things. Thinking about it now, it might have been the right occasion to make other suggestions/changes as well, though. As in … “if you change something anyway, then make all outstanding changes at once”? As I’m new to this and already feel like I’m harassing the “real devs” endlessly with questions, I just stuck to a minimal set I assumed was “needed”.

  • Translations can also concern URLs if those need to be adapted accordingly (language code such as /en/ in the URL). This is done by defining a variable within the translation tag in question that uses a function call to retrieve the “translated URL”. The variable is then used instead of the URL (requiring tags again). This also means you can’t just add a URL, but need to “build” the project and test whether your tags (the “code” to define and use that URL) actually work. And there are the little things, like variable names. I found quite a few variables that are still named “monotone” or alike but correctly point to git. However, I didn’t touch those variables, as I’m not sure about their scope and how they are handled, even though I could probably have just renamed them if their scope is local to the tag, as I assume. Without building/testing (to make sure I didn’t have a typo in there “breaking” the build) that’s not a good way to work. It basically makes others (checking the merge request) do (my) work.

So lots of stuff to consider it seems.

Building i2p.i2p

Before looking into the whole Docker thing (which I fear might take me some time), I wanted to check out building the i2p router. I mean, now that I got the sources … ant debianhowto summarises the needed dependencies, let’s go:

root@go:~# apt install debhelper ant debconf default-jdk gettext libgmp-dev po-debconf fakeroot build-essential quilt dh-apparmor libservice-wrapper-java libjson-simple-java devscripts libjetty9-java libtomcat9-java libtaglibs-standard-jstlel-java libgetopt-java

That was some hundred extra packages there, so it takes a while.

lbt@go:~/synced-git.idk.i2p/i2p.i2p$ ant debian
[...]
debchange:
     [echo] Debian version is 2.1.0-2-d45df857-1
     [exec] dch warning: neither DEBEMAIL nor EMAIL environment variable is set
     [exec] dch warning: building email address from username and mailname
     [exec] dch: Did you see those 2 warnings?  Press RETURN to continue...

debian-binary:
     [exec] dh patch --with systemd,bash-completion
     [exec] dh: error: Unknown sequence patch (choose from: binary binary-arch binary-indep build build-arch build-indep clean install install-arch install-indep)
     [exec] make: *** [debian/rules:54: patch] Error 25

BUILD FAILED
/home/lbt/synced-git.idk.i2p/i2p.i2p/build.xml:2642: exec returned: 2

Total time: 1 second

First thing to note is that “Press RETURN to continue …” did not wait for sh.. ;) But what’s the error then? I checked that “patch” is installed, so what gives?

After this, I wasn’t able to build installer-linux anymore, which I thought had worked before. Resetting the repository’s state with git checkout . did the trick here.

But also … do I actually need to build Debian packages? That’s part of the release process. For testing some modifications some other installer might even be better? So let’s try that installer-linux target:
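The two dch warnings in the log are a separate matter from the build failure: dch complains because it has to guess an e-mail address. As far as I understand, exporting DEBEMAIL (and DEBFULLNAME) before running the target should silence them, and with them the prompt. A small sketch with placeholder values; the ant call is left as a comment because it needs the i2p.i2p checkout:

```shell
#!/bin/sh
# Placeholder identity, substitute your own values.
DEBFULLNAME="Example Builder"
DEBEMAIL="builder@example.invalid"
export DEBFULLNAME DEBEMAIL

# With these exported, dch no longer has to build an address from
# username and mailname. Then, from the i2p.i2p checkout:
#   ant debian
echo "changelog identity: $DEBFULLNAME <$DEBEMAIL>"
```

This does nothing about the “Unknown sequence patch” error, which still needs a closer look.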

lbt@go:~/synced-git.idk.i2p/i2p.i2p$ ant installer-linux
[...]
BUILD SUCCESSFUL
lbt@go:~/synced-git.idk.i2p/i2p.i2p$ java -jar i2pinstall_2.1.0-2_linux-only.jar -console
[...]

I’m used to running i2p as a service and here did a user install for maybe the first time. That went smoothly, and with ./i2prouter start I verified that I can run “the product” ;) Nice :) Let me note this was 2023-01-18 (as opposed to the creation date of this post) for possible future reference …

Another thing to mention is a difference between GitLab and the actual sources. When looking up where the string “Specify Port” is used in the sources, a search on GitLab always felt kind of “spammed with translation files”, i.e. the majority of the search results were translation files. As I actually wanted to narrow the search down to code only, I looked into how to exclude PO files from the search. Turns out that is a feature you only get with “premium” or so. Here’s the freemium version using the local sources ;)

find . -type f | grep -v '\.po$' | xargs grep "Specify Port"
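The same filter can also be done by grep alone, assuming a grep that supports --exclude (GNU grep does). A small sketch on a made-up directory tree; the file names are invented for the demonstration:

```shell
#!/bin/sh
set -eu
demo=$(mktemp -d)
mkdir -p "$demo/src" "$demo/locale"
printf 'String label = "Specify Port";\n' > "$demo/src/Config.java"
printf 'msgid "Specify Port"\n' > "$demo/locale/messages_de.po"

# -r: recurse into directories, -l: print only the names of matching files,
# --exclude: skip files whose name matches the glob.
matches=$(grep -rl --exclude='*.po' "Specify Port" "$demo")
echo "$matches"
```

Dropping -l gives the matching lines themselves, as with the find pipeline.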