All posts by nhnt11

Spotify Wrapped – Site Identity Hygiene Study

Today I was made aware of Spotify Wrapped. On the surface, it seems like a cool little tool to get a perspective of your music-listening habits on Spotify in the last year:

“Take a look at how you listened. Because no one else listened exactly like you.”

But when I visited the site, something smelled fishy (pun intended). I did some poking around and here’s what I found:

  1. The site is hosted on a non-Spotify domain: spotifywrapped.com
  2. The site uses a free Let’s Encrypt certificate that covers spotifywrapped.com and does not specify an Organization.
  3. The official Spotify domain uses a Digicert certificate that covers *.spotify.com and specifies that it belongs to an Organization “Spotify AB”.
  4. All the links on the website – Legal, Privacy, Cookies, etc. – point to spotify.com URLs.
  5. The site uses OAuth to connect to the user’s Spotify account and get access to their listening data. The OAuth prompt says “You agree that 2018 Wrapped is responsible for its use of your information in accordance with its privacy policy.” This messaging is confusing – “2018 Wrapped”? And “its privacy policy” – the site links to Spotify’s official privacy policy page and does not have one of its own.

Upon Googling, I was able to find a Spotify Newsroom (or “For The Record”?) article confirming that this is indeed a Spotify product. Spotify Newsroom/For The Record is hosted on a subdomain of Spotify, and is therefore trustworthy.

This is problematic, because I could very easily make another site (spotifywarped.com maybe? 😉), get a Let’s Encrypt cert, link to Spotify’s official policy pages, and maybe even use OAuth to get access to user data, all the while pretending to be Spotify.

This sets a precedent, and conditions users to more easily trust websites hosted on arbitrary domains with their personal data – making it easier to craft successful phishing attacks that piggy-back on brand trust. I think this is less than ideal for the security of the Web.

I want to acknowledge that people who build these products/tools are often under constraints and sometimes it’s necessary to make compromises in order to ship. But I think developers should prioritize site identity hygiene – for the health of the web.

This site should have been on a subdomain of spotify.com. Or at the very least, I’d have been relatively happy with a link to a Spotify-hosted document (like the news article) verifying that Spotify Wrapped is authentic. The more Web developers practice identity hygiene, the more browsers can take advantage of that to build tools and UX to help users protect themselves. It’s an ecosystem.
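To make the subdomain point concrete, here is a minimal sketch of the check a cautious user (or a tool acting on their behalf) could apply to a hostname. The function name is_on_domain and the hostname wrapped.spotify.com are made up for illustration. The check is deliberately naive: it only compares names, and knows nothing about certificates or the public suffix list.

```shell
# Succeeds if host equals domain or is a subdomain of it.
# Hypothetical helper for illustration; it compares names only.
is_on_domain() {
  local host domain
  host=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  domain=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
  host=${host%.}  # ignore a trailing root dot, as in "spotify.com."
  case $host in
    "$domain" | *".$domain") return 0 ;;
    *) return 1 ;;
  esac
}

is_on_domain wrapped.spotify.com spotify.com && echo "on spotify.com"
is_on_domain spotifywrapped.com spotify.com || echo "not on spotify.com"
```

A lookalike such as spotifywrapped.com fails the check because it merely starts with the brand name; only names that end in ".spotify.com" (or equal spotify.com) pass.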

[Update] Unrelated to identity hygiene, but just realized the website also doesn’t support Firefox on Android – “This website is optimized for certain devices and browsers. Sorry about that.” 🙁

Small Talk Is Beautiful

A pensive mood engulfs my mind when I’m flying somewhere. There’s something about the incredible experience of being lifted into the sky in a metal cylinder – watching as layers of clouds pass by until suddenly you poke out into the clear blue expanse – there’s something about it that inspires thoughts of love and loss, culture and economics, space and time, and the profound arbitrariness of existence. Sometimes I find myself laughing out loud, hysterically, like a toddler.

Well, I’ve been flying today, and whether it was the euphoria, or being in the midst of travelers from around the world, or just the four cups of coffee I’d gulped down in order to power through after getting up at 5am, I found myself excitedly discussing the philosophies of context and language in a music-discovery WhatsApp group that I’m part of.

In 2018’s conflicted world of diverse beliefs and myriad manifestations of the human experience, I found myself drawn to the simple Earthly connection that is talking about the weather. No matter where you’re from, no matter what language you speak, no matter what you believe about existence and reality, we all understand sunshine and rain.

Small talk is beautiful.

Web-inspired strategies for linguistic change

People working on platforms like the Web often have to figure out ways to get people to use new APIs or stop using old ones. That got me thinking – can we borrow those strategies for propagating linguistic change in a social context? I ended up writing about it!

I have a pet theory (in development) about how the internet and the protocols that enable it are abstractions that naturally emerge from human communication – one example is how HTTPS requires a mutually trusted certificate authority to establish trust between two communicators. Real life analogy – you’re far more likely to trust someone who is introduced to you by someone you already trust, than you are to trust a stranger on the street. In other words, internet protocol design is governed by many of the same philosophies of truth, subjectivity, yadda yadda that govern human communication.

That was just for context! The other day, I was perusing a proposal to implement something called “feature policy control” over an existing Web API called document.domain. The tl;dr is that document.domain is a troublesome API and browser vendors want to drive down its usage. “Feature policy control” means that consumers will be required to declare their intent to use the API (or not). The proposal suggests that feature policy control for document.domain be implemented so that it remains supported by default at first, which preserves compatibility with existing consumers, and this can hopefully be changed to opt-in later on if usage of document.domain in the wild goes down eventually.

I think that this is a great example of how the Web platform (among others) is essentially a language, and that API designers and working groups are involved in the business of evolving it. It got me thinking about whether any of the strategies used to drive adoption or rejection of APIs might be applicable to the intentional evolution of English and other human languages at any scale. I know that I frequently apply abstractions I learn from software engineering and internet protocols in my own life.

Here’s a simplified example to illustrate the analogy:
Consider a scenario in which a person is trying to transition their gender pronoun within their Twitter community. One approach to transition might be to broadcast their new pronoun (e.g. by including it in their bio), but not actively enforce it. Eventually, people catch on and start using the new pronoun. After a while, once the new pronoun has been established, it may be time to start actively correcting people – enforcing the new pronoun – and perhaps stop broadcasting it in their bio.

The story isn’t perfectly realistic, but the analogy here is that the gender pronoun is the “API”, and the transition technique is the “feature policy control” – initially it is opt-in, but eventually it is enforced.

I’m intrigued by the possibility of finding more of these analogies and maybe finding inspiration for new strategies. Usually network protocols are well specified and edge cases are thought out – it might be interesting to feed that rigor back into a social context.

Definitions of Computation / The Universe is an Expression

A couple of days ago, I found myself in a tram in Berlin on the way back home from a Saturday-night ramen excursion. Among my companions was a friend of mine with whom I often engage in banter around various topics that range from social issues, identity, and various -isms, to math, physics, and philosophy. On this particular evening, our conversation veered into a particularly abstract realm: computation.

I was posed a challenge:
What is computation? Can you define it?

I thought long and hard about this. My immediate answer was something like: computation is the process of elimination of relations from abstractions. I was thinking of a simple example computation: finding the sum of two integers. This operation takes a relational abstraction (the abstraction 2+3 relates the integers 2 and 3 in a well-defined way) and converts it to a single symbol (for 2+3 this is 5).

In response, I was introduced to the idea that computation is actually a *linguistic* concept. This baffled me at first, but the more I thought about it, the more it made sense, especially after thinking about the definition of “language” itself.

From Wikipedia:

The collection of regular languages over an alphabet Σ is defined recursively as follows:

  • The empty language Ø, and the empty string language {ε} are regular languages.
  • For each a ∈ Σ (a belongs to Σ), the singleton language {a} is a regular language.
  • If A and B are regular languages, then A ∪ B (union), A • B (concatenation), and A* (Kleene star) are regular languages.
  • No other languages over Σ are regular.

This is not extremely critical to the final definition of computation I came up with, but it provided me with context and inspiration to think further.

In the simple arithmetic relation a+b=c, the = symbol is representing the equivalence of two abstractions: a+b and c. The left-hand and right-hand abstractions of this equation actually follow their own languages!

The alphabet of the language on the left-hand-side is the set of all integers, along with the “+” symbol. A valid abstraction (or expression) in this language must contain two “words”: the sum operation has two inputs. More rigorously, to express the sum of two integers as a single integer, the alphabet of the left-hand language requires a set of symbols that can be used to express the complete set of integers, and a delimiter symbol that is not in this first set.

Similarly, the alphabet of the language on the right-hand-side is also the set of all integers. A valid abstraction/expression in this language must contain a single “word”. No delimiter symbol is required.
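As a rough sketch of the two languages just described, the snippet below accepts a left-hand expression (two integer “words” separated by the “+” delimiter) and converts it into the equivalent right-hand expression (a single integer). The names is_integer and evaluate_sum are invented for this illustration, and the sketch assumes integers are written in decimal without leading zeros.

```shell
# Succeeds if $1 is a "word" of the integer alphabet: an optional minus
# sign followed by digits, with no leading zeros.
is_integer() {
  local digits="${1#-}"
  case $digits in
    '' | *[!0-9]*) return 1 ;;  # empty, or contains a non-digit
    0?*) return 1 ;;            # leading zero
  esac
  return 0
}

# Converts a left-hand expression "a+b" into the equivalent right-hand
# expression: a single integer. Fails if the input is not in the language.
evaluate_sum() {
  local expr="$1" left right
  case $expr in
    *+*) ;;          # the delimiter must be present
    *) return 1 ;;
  esac
  left=${expr%%+*}   # the word before the first "+"
  right=${expr#*+}   # everything after the first "+"
  is_integer "$left" && is_integer "$right" || return 1
  echo $(( $left + $right ))
}

evaluate_sum "2+3"   # prints 5
```

Inputs such as "2+" or "2+3+4" are rejected: they are not valid expressions in the left-hand language, so there is nothing to convert.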

Here’s my penultimate final definition of computation:
Computation is the process of converting a given abstraction in an arbitrary language to an equivalent abstraction in another.

I say penultimate because I have one final thought: I don’t think that it’s necessary to mention languages in the definition. Languages are modes of expression, and I’m not sure computation should be restricted to expressed abstractions. I might change my mind about this after further thought, but in any case, the important thing seems to be equivalence. I did indeed think about this more, and I realized that abstractions are ways to express abstract things – i.e. there’s no meaning to the phrase “expressed abstractions”. The word abstraction is a bit ambiguous it seems. In any case, the important part of this definition seems to be equivalence. In order for a computation to be possible, there must exist at least one way to express the abstract thing in both the source and target languages. This makes me appreciate the concept of Turing-completeness more – it standardizes a scope for expression itself and establishes the requirements for an arbitrary language to achieve that scope.

This is very very interesting to me. It seems to cover all the examples I could think of to test it. Here are three of them:

  • Binary arithmetic as performed by ALUs uses (in the simplest case) two input registers and an output register. The input registers and the circuitry connecting them to the output register together are a physical abstraction equivalent to a+b, and the output register is a physical abstraction equivalent to c. I don’t know about you, but I’m seeing a conversion of abstractions here, where the two input registers are the “alphabet” of the left-hand language, the output register is the alphabet of the right-hand language, and the circuit connecting them is an abstraction for the rules of equivalence between them.
  • A pseudo-random number generator is a function that operates on a seed and outputs a number. Seems to fit the definition of computation pretty well, similarly to the first example. If I may embark on a tangent, randomness itself is a pretty fascinating non-trivial concept. Implementations of random number generators require a seed or external source of entropy. Very cautiously, I suggest that generation of entropy is not a computational process. Indeed, computation might not be possible without entropy. I will leave you to think about this further on your own time, because I haven’t thought about this enough myself.
  • Bear with me as I enter some sketchy territory here: the process of human thought is a computation. The universe is a physical expression of the language of nature (there are branches of physics including quantum field theory and string theory that study the core abstractions involved). Humans express their thoughts in human languages. Thoughts would not exist without the universe, and so the process of thought is a conversion of abstractions expressed in the physical universe to abstractions expressed in human language. Corollary: the quest to find a “theory of everything” is a quest to find the language of nature.

I emerge from this thought process with a better understanding of concepts around abstraction, expression, information, and language. I know that Theory of Computation is a standard part of Computer Science programs and I look forward to perhaps diving into this topic further some day. Many thanks to Miko Mynttinen – the friend in question who got me thinking about this.

I like to end posts like this with quotes, so here’s a fitting one from Carl Sagan:

If you wish to make an apple pie from scratch, you must first invent the universe. 🥧 = 🌌

Insomniac

Found this one while perusing the contents of my phone; written in early 2018. IIRC there was more, but I can’t find the rest.

Sleep continues to
Remain ever elusive
So I write haiku

Dusk was long ago
Dawn approaches and with it
Another long day

As I lie awake
Pondering my existence
Lost in my own mind

Time relentlessly
Ticks on, ever stretching this
Cosmic symphony

Stars and dust and ice
Molecules in vibration
Somehow foster life

Wondrous as it is
Inescapable it seems
Is monotony

No RSVP Necessary

I wrote this in a five-minute burst of inspiration earlier this year following the first time I saw the aurora from the window of a plane flying somewhere over Greenland. As a lover of nature, I was trying to reconcile a raw loneliness from having no one I knew around me with whom to share the experience, with the (rather solipsistic) idea that all my loved ones as I know them are projections I’ve constructed in my mind anyway.

Fly with me over the mighty mountains
The racing rivers, towering forests
Sprawling deserts, open oceans

Soar with me through cotton clouds,
We'll rush past the moon, glide into the sun
Feel a trillion stars twinkle around us, breathless

Drive with me through the city
We'll wind our way through its maze
Of blinking neon lights and blaring horns

Stroll with me down the Pacific coastline
We'll leave our footprints to mingle
With seaweed and starfish

We'll explore this magical universe 
Adventure through the expanse of space and time
Zigzag our way between minds and matter

And in this marvelous reality
I experience all within myself 
No RSVP necessary

Because when I think of you and you think of me
What's the difference between me and we?

hg wip is slow af

My day job at Mozilla involves interacting with a large Mercurial repository and keeping track of multiple chunks of work. I use a workflow that involves having a chain of commits per bug that I’m working on, with each chain descending from the current mozilla-central tip.

In order to quickly take a look at the state of my repository – i.e., the different bugs that I’m currently working on, which commit is currently selected in the working directory, etc. – I use the hg wip command, about which you can learn more here. This is pretty awesome, but it has one drawback for me: it’s SLOW AS FUCK.

To quantify the slowness for the sake of this blog post, I very scientifically time’d hg wip a few times, and the fastest run was 1.4s. When you’re quickly iterating on a patch, this is BAD. For one-line changes, making the change probably took less time than running hg wip.

Here’s my solution: cache the output of hg wip. The effect of this should be that hg wip is an instantaneous command (it should just dump a cached output), with the side effect that it might provide out-of-date information. In practice, this side effect does not bother me much. Here’s how I implemented caching:

In ~/.hgrc:

[alias]
wwip = log --graph --rev=wip --template=wip
wip = !tput rmam; cat ./.___WIP___ || $HG wwip; tput smam

In words, this basically remaps the original wip command to wwip, and makes wip dump a file, .___WIP___. Where does this come from? The answer is an addition to my .bash_profile:

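hg() {
 # Run the real hg and remember its exit status so callers still see it.
 command hg --pager=no "$@";
 local rc=$?;
 temp_file_wip="$(mktemp)";
 # Refresh the cache in the background: write to a temp file, then move it into place.
 (bash -c "unbuffer command hg --pager=no wwip > $temp_file_wip; mv $temp_file_wip ./.___WIP___;" &);
 return $rc;
}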

Boom. This acts as a proxy for hg: it transparently captures hg commands, runs them, but also (atomically) dumps hg wwip into the .___WIP___ file in the background, for future snappy hg wip runs.
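The “atomically” part relies on writing to a temporary file and then renaming it over the cache, so a concurrent hg wip never reads a half-written file. One caveat: a rename is only atomic when the temporary file and the target live on the same filesystem, which the default mktemp location does not guarantee. Here is a minimal sketch of the pattern with the temporary file created next to the target. The helper name atomic_write is made up and is not part of hg.

```shell
# Writes stdin to the target file via a temp file in the same directory,
# so readers see either the old contents or the new, never a partial file.
# Hypothetical helper for illustration.
atomic_write() {
  local target="$1" tmp
  tmp=$(mktemp "${target}.XXXXXX") || return 1
  if cat > "$tmp"; then
    mv -f "$tmp" "$target"
  else
    rm -f "$tmp"
    return 1
  fi
}

# Example: command hg --pager=no wwip | atomic_write ./.___WIP___
```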

Cheers!

PS: I do have fsmonitor enabled, but either I have unreasonable expectations from it or all my efforts to set it up and get it working correctly with my m-c clone have been futile.

Inner Monologue

Reality surrounds you. You see, hear, touch, taste, smell, and otherwise perceive your environment. There are a zillion things going on around you all the time, yet you get used to the overwhelming amount of information coming at you at a very young age – possibly as early as infancy.

With this bit of context, I want to talk about language. In my opinion, language is a powerful tool – perhaps the most powerful – to construct some satisfactory model of reality for yourself that prevents you from being overwhelmed every time you open your eyes.

Here’s an example to illustrate the point I’m trying to make. Imagine you are a being that has spawned into this (or some other) universe for the first time. You have no knowledge of anything whatsoever – everything is new. Let’s say you spawned into a completely empty room, with a single bright yellow light in front of you.

Wait. We already got ahead of ourselves. What is “empty”? What is a “room”? What are “single”, “bright”, “yellow”? These are completely new concepts! In order to form an idea of what “empty” is, you’d first have to experience a room that is full of things. Then, you might draw a comparison between the two. Further on in this thought experiment, you might realize that it’s not just rooms that can be empty or full, and you might articulate the more general idea of a container. Same goes for the other words – take a moment to think about them.

Simple, individual words that you use without a second thought represent and convey large amounts of information. They give you the powerful ability to qualify and quantify reality around you with relative ease.

In my experience, many people are familiar with the idea that language is a communication tool; it allows you to share an idea with someone else. But I want to draw attention to a specific use of language: communication with oneself – the inner monologue.

There’s a voice in your head that thinks the words you want to express before you say them out loud or write them down. That voice exclaims “Yes!” in triumph when you finally solve a difficult puzzle or beat a game. It goes “Fuck!” when you stub your little toe. And that same voice nervously goes over your rehearsed lines before a presentation or speech.

I’m going to model this voice as a character. An interesting tangent might be to study the link between this character and your sense of identity, but that’s probably worthy of a whole book or maybe at least an essay or paper. Anyway, let’s call this character Voicey McVoiceface.

It’s undeniable that you have an intimate relationship with Voicey. Voicey is who you listen to when you’re alone. When you’re engaging in very personal activities – when you’re showering, doing your make-up, preparing for an important meeting, when you’re in bed just before falling asleep, when you’re having an existential crisis – Voicey is there.

Through Voicey, you communicate with yourself. And when you think about it this way, you realize that Voicey is just as prone to being wrong as anyone else you know. Just as prone to being an asshole. Just as prone to being hurt, angry, afraid. Just as prone to being happy, loving, kind. I think it’s important to question Voicey. Call Voicey out on bullshit. You have control. Voicey is how you constantly describe reality to yourself – make damn sure that you’re not letting yourself get scammed.

These thoughts came from thinking about how the tone of my inner monologue affects my life. I feel that exploring these concepts further should be a personal quest, so I’ll stop here (a bit abruptly, I’ll admit) and leave you with this quote from David Foster Wallace’s This Is Water speech:

There are these two young fish swimming along and they happen to meet an older fish swimming the other way, who nods at them and says “Morning, boys. How’s the water?” And the two young fish swim on for a bit, and then eventually one of them looks over at the other and goes “What the hell is water?”

Side-by-side Diffs in a Terminal

Today I set up side by side colored diffs for Mercurial. This may not seem like a big deal, but there were a few problems I encountered:

  • Most solutions online point to using the extdiff extension – this doesn’t work too great with hg qdiff.
  • Side by side diffs require more screen real estate, but when I’m not viewing a diff, I want my terminal window to stay in its corner on my screen, at its usual 92×35.

My final solution involves using aliases, Xterm control sequences to resize my window, and cdiff.

Installing cdiff was easy using pip. Once that was done, I set up my aliases in ~/.hgrc:

[alias]
ddiff = diff
qqdiff = qdiff
diff = !printf '\e[9;1t'; $HG ddiff $@ | cdiff -s -w 0; printf '\e[9;0t'
qdiff = !printf '\e[9;1t'; $HG qqdiff $@ | cdiff -s -w 0; printf '\e[9;0t'

The first two lines “back up” the original diff and qdiff commands; the next two re-alias diff and qdiff to pipe through cdiff!

Before diffing, the aliases printf the Xterm control sequence to maximize the window, and then restore the window after diffing.

The -s flag makes cdiff do a side-by-side diff, and -w 0 makes it use all the available real estate.
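The maximize-then-restore dance can also be factored into a small wrapper, sketched below. The name with_maximized_window is made up. The escape sequences are the same ones the aliases use, written with \033 (the portable spelling of \e), and they only have an effect in terminals that honour Xterm window operations.

```shell
# Runs a command with the terminal maximized, then restores the window.
# Hypothetical helper; the escape sequences are the ones from the aliases.
with_maximized_window() {
  local rc
  printf '\033[9;1t'   # Xterm window op: maximize
  "$@"
  rc=$?
  printf '\033[9;0t'   # Xterm window op: restore
  return "$rc"
}

# Example: with_maximized_window sh -c 'hg ddiff | cdiff -s -w 0'
```

The wrapper passes the command's exit status through, so it can be dropped in front of other commands without hiding their failures.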

That’s it! I’ve been using it all day and absolutely love it, so I thought I’d share.

Cheers!

Raspberry Pi as an OpenVPN Gateway/Router

Over the last week, I got myself a VPS on DigitalOcean and have been playing around with it. Something I’ve wanted to do for a while is to set up a VPN tunnel for myself, and I finally did it.

I decided to write a blog post on my setup. I have a Raspberry Pi set up as a router on my Wi-Fi network, and it sends all traffic over the VPN. I’m not going to get into the reasoning for why I’m using something versus something else for fear of getting into rants in what’s going to be a long post anyway.

The Server

I got myself a “droplet” on DigitalOcean with 512MB of RAM and a 20GB SSD and Ubuntu 14.10 x64. I uploaded my pubkey on creation of the droplet, so it automatically set up ssh to work with it. If you choose not to, it will email you the default root password. I recommend disabling root login and setting up pubkey authentication immediately.

The first thing I did was create a new user account for myself and grant it sudo access. Then I enabled ssh on an additional port (just in case) and disabled password authentication. Finally, I took a “snapshot” of the basic setup as a backup.

Installing OpenVPN

I followed the instructions here to set up the OpenVPN server. Make sure you get the right deb file for your OS – the one in the post is for Ubuntu 12.x. OpenVPN offers an auto-login config profile – I grabbed this from the web UI so my Raspberry Pi could connect without me having to type in a password every time.
That’s it! Now for the client side.

The Wi-Fi Router

My Wi-Fi router is set up in IP sharing mode. This means that traffic from all the devices in my room will appear to my dorm’s router as coming from the same IP, and I have a local network on the 192.168.1.0/24 subnet.

The Raspberry Pi as an OpenVPN Client

The distro I’m running is Raspbian “wheezy” from September 2013. I’m using this because the image was already available on the campus FTP server. Setting up OpenVPN is easy:

$sudo apt-get install openvpn

After that, I copied over the auto-login config file:

$scp /path/to/client.ovpn pi@<pi's ip address>:/tmp/client.ovpn
$ssh pi@<pi's ip address>
$sudo mv /tmp/client.ovpn /etc/openvpn/client.conf

Now to start the client and test if it’s working:

$sudo service openvpn restart
$curl ifconfig.me

The output should be the VPS’s public IP – that means everything is working. If it’s not, keep curl’ing a few times – it might take a few seconds to take effect.

Finally, I added the following line in the OpenVPN config file to bypass the VPN for intranet IPs:

route 10.0.0.0 255.0.0.0 192.168.1.1

That will bypass the VPN for any connections to the 10.0.0.0/8 subnet (192.168.1.1 is my Wi-Fi router’s local IP).
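If the jump from the netmask 255.0.0.0 in the route line to the /8 in the subnet notation is not obvious, this small sketch does the conversion. The helper name mask_to_prefix is made up, and it assumes a well-formed mask: it does not check that the set bits are contiguous.

```shell
# Prints the prefix length for a dotted-quad netmask, e.g. 255.0.0.0 -> 8.
# Hypothetical helper. The body runs in a subshell so the IFS change
# does not leak into the caller.
mask_to_prefix() (
  bits=0
  IFS=.
  set -- $1          # split the mask into its octets
  [ $# -eq 4 ] || return 1
  for octet in "$@"; do
    case $octet in
      255) bits=$(($bits + 8)) ;;
      254) bits=$(($bits + 7)) ;;
      252) bits=$(($bits + 6)) ;;
      248) bits=$(($bits + 5)) ;;
      240) bits=$(($bits + 4)) ;;
      224) bits=$(($bits + 3)) ;;
      192) bits=$(($bits + 2)) ;;
      128) bits=$(($bits + 1)) ;;
      0) ;;
      *) return 1 ;;
    esac
  done
  echo "$bits"
)

mask_to_prefix 255.0.0.0        # prints 8
mask_to_prefix 255.255.255.0    # prints 24
```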

The Raspberry Pi as a Router

I wanted the Raspberry Pi to serve as a gateway and DHCP server for my Wi-Fi network. To achieve this, first it needed a static IP. I edited /etc/network/interfaces for this:

auto eth0
iface eth0 inet static
address 192.168.1.11
netmask 255.255.255.0
network 192.168.1.0
broadcast 192.168.1.255
# Wi-Fi router IP (comments must be on their own line in this file)
gateway 192.168.1.1

Then, I needed to allow NAT:

$sudo iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE

To make this rule persist (from https://wiki.debian.org/iptables):

$iptables-save > /etc/iptables.up.rules

To restore the rules after a reboot, create this file:

$nano /etc/network/if-pre-up.d/iptables

Add these lines to it:

 #!/bin/sh
 /sbin/iptables-restore < /etc/iptables.up.rules

The file needs to be executable so change the permissions:

$chmod +x /etc/network/if-pre-up.d/iptables

Now, I was able to connect any client to the Wi-Fi network and browse through the VPN using the Pi (192.168.1.11) as the gateway!

The Raspberry Pi as a DHCP Server

Finally, I wanted devices to automatically use the Raspberry Pi as the gateway without any “advanced” manual configuration. To do this, I installed dnsmasq:

$sudo apt-get install dnsmasq

And edited the config file (/etc/dnsmasq.conf) to set the DHCP ip-range:

interface=eth0
dhcp-range=192.168.1.2,192.168.1.254,255.255.255.0,12h #start,end,mask,lease time

Now all I had to do was disable my Wi-Fi router’s DHCP server and voilà. Now any device connected to my Wi-Fi would automatically go through the Pi and hence the VPN connection.

Making DC++ Work in Active Mode

DC++ is widely used for file sharing on campus. Behind a firewall or router (like in my setup), I could only use DC in passive mode – which limits my search results greatly. To make active mode work, I set up my Raspberry Pi as a virtual DMZ station on my Wi-Fi router. This makes the router redirect all inbound packets to the Raspberry Pi. After that, it was a matter of setting up port forwarding.

First, I added this line to /etc/dnsmasq.conf to give my Macbook a hostname (nhnt11-mbp) and static IP with an infinite lease time:

dhcp-host=<macbook's mac address>,nhnt11-mbp,192.168.1.12,infinite

Then, I made my Raspberry Pi forward port 1412 (TCP and UDP) to my Macbook:

$sudo iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 1412 -j DNAT --to-destination 192.168.1.12:1412
$sudo iptables -t nat -A PREROUTING -i eth0 -p udp --dport 1412 -j DNAT --to-destination 192.168.1.12:1412
$sudo sh -c 'iptables-save > /etc/iptables.up.rules'

And that was it! My room is now fully VPN’d.

Automator App to Connect Pi to VPN

As a bonus, I decided to make a small Automator app to run a shell script to reconnect the Raspberry Pi to the VPN and display a notification when the connection was good to go. The content of the shell script is as follows; you can figure out Automator yourself 😉

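#!/bin/bash
# Restart the OpenVPN client on the Pi.
ssh pi@192.168.1.11 sudo service openvpn restart
# Poll the public IP until it matches the VPS's, i.e. traffic goes through the VPN.
IP=$(curl -s ifconfig.me)
while [ "$IP" != "<VPS's public IP>" ]; do
    sleep 1
    IP=$(curl -s ifconfig.me)
done
echo "Connected!"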
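One thing to be aware of: the loop above polls forever if the VPN never comes up. A possible variation, sketched here under the assumption that giving up after a fixed number of attempts is acceptable, wraps the check in a bounded retry helper. The names wait_until, vpn_is_up and VPS_IP are made up for this sketch.

```shell
# Retries a command up to $1 times, sleeping $2 seconds between attempts.
# Succeeds as soon as the command does; fails if every attempt fails.
wait_until() {
  local attempts="$1" delay="$2" i=0
  shift 2
  while [ "$i" -lt "$attempts" ]; do
    "$@" && return 0
    i=$(($i + 1))
    sleep "$delay"
  done
  return 1
}

# Hypothetical check; VPS_IP stands for the VPS's public IP.
vpn_is_up() {
  [ "$(curl -s ifconfig.me)" = "$VPS_IP" ]
}

# Example: wait_until 30 1 vpn_is_up && echo "Connected!"
```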

That’s it! Cheers!