Wednesday, March 23, 2016

It's all about trust, context and timeliness - reinventing social media with blockchain

In the week since I posted about my experience as an /r/Bitcoin moderator, did an AMA on the subreddit, and followed the discussion in the related topics (1, 2, 3, 4, 5, 6), I've heard a lot of people say that the way Reddit works is rather broken. This was further amplified by BashCo's recent research into possible vote manipulation in our subreddit, and by a few people noticing that some of the most popular subreddits seem to be heavily moderating content about the Brussels terror attack, removing some highly-voted submissions and a lot of comments on the subject. Long story short - it looks like a lot of the discussion on Reddit is swayed and controlled by a handful of people who run the subreddits (/r/Bitcoin notwithstanding). So would it be possible to reinvent Reddit and similar social media in a way that would make vote manipulation, trolling, spamming and centralization of power irrelevant?

Filter bubbles and Web of Trust


We consume a lot of media and news. The more relevant something is to our interests, the more likely we are to consume it, so a lot of companies logically try to give us "filter bubbles".


In short, a filter bubble personalizes the news and stories we see in our social media feeds based on what we enjoy and have consumed in the past. If we're pro Bernie Sanders, we might hear more news from /r/SandersForPresident/ rather than from /r/The_Donald/. Companies like Facebook do this more covertly, while Reddit is more explicit about allowing you to subscribe to whichever subreddit sparks your interest the most.

We could use the concept of an explicit, personalized filter bubble to figure out what news would be relevant to us. We would have to pick which things we're interested in (tech news, pictures of cute animals, videogames, etc.) to create the "context" for our interests (more on that later). However, this would only be as useful as the quality of the content on offer, which can be filled with spam and manipulated. This is where the idea of a Web of Trust comes in.

A Web of Trust is a formal way of stating whom you trust, creating a network of trust-based relationships between users. The idea has been applied both to finding trustworthy websites and to money. In our context, we would create our digital identity (or multiple identities), and then state which other identities we trust or distrust. Whether they are family members, celebrities or just some random stranger on the internet posting funny pictures, we could cherry-pick who is relevant to our interests. Those people, and the people they are connected with, would give us access to the relevant data feeds.

The Web of Trust can also be resilient against sybil attacks and vote manipulation - it wouldn't matter if I created a million identities and had them all "Like" some post, as long as neither you nor anyone you trust trusted those puppet accounts. Trust could also be diluted in a PageRank fashion - if I trust a million people, I might be less reliable than someone who trusts only a handful.
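
To make the dilution and sybil-resistance ideas concrete, here is a minimal Python sketch. The function names and the even split of a trust budget of 1.0 are my own assumptions for illustration; a real PageRank-style scheme would be more involved.

```python
def diluted_trust(trust_lists):
    """Split each user's trust budget of 1.0 evenly across everyone they trust.

    Assumption: an even split stands in for a PageRank-style dilution.
    """
    return {
        user: {peer: 1.0 / len(peers) for peer in peers}
        for user, peers in trust_lists.items()
        if peers
    }


def visible_likes(me, likers, weights):
    """Add up likes, counting only accounts that `me` trusts directly."""
    my_weights = weights.get(me, {})
    return sum(my_weights.get(liker, 0.0) for liker in likers)


puppets = [f"puppet{i}" for i in range(1000)]
weights = diluted_trust({
    "me": ["alice", "bob"],  # I trust two people, so each gets half my budget
    "spammer": puppets,      # the spammer spreads the same budget over 1000 puppets
})

# The puppets' likes carry no weight for me, because I do not trust them.
print(visible_likes("me", puppets, weights))
# Adding alice's like is the only thing that moves the number.
print(visible_likes("me", ["alice"] + puppets, weights))
```

Someone who trusts a thousand accounts hands each of them a tiny share, while someone who trusts a handful gives each a meaningful one.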

Context is also important for trust. While I might trust someone to provide me with animated book reviews, I wouldn't take technical advice from them. Similarly, if someone were blatantly in opposition to my views, I might explicitly distrust them in that field. This would allow us to mix and match whom we trust and under what circumstances, making our filters and web of trust more explicit. For example, my trust might look like:






              [Self]  Videogames  Wrestling  Bitcoin  Astronomy
TotalBiscuit  0.1     0.9         0          0        0
LukeJr        0       0           0          0.2      -1.0

In this example, I would value TotalBiscuit's opinion on videogames highly - he's a competent critic with years of reputation. While he might also be a fan of wrestling, I don't share his passion for the topic, so I'm indifferent to anything he would share in that context. He's not an expert on Bitcoin, so I'd also give him a 0 there.

LukeJr's opinion on Bitcoin I would trust a bit - he's a competent developer, although his abuse of trust in the Gentoo story leaves something to be desired. Similarly, his view that the Sun orbits the Earth makes me distrust whatever he has to say on the subject of astronomy.

The [Self] variable would be a category about the person - in this example I would be interested in hearing a bit about TotalBiscuit's general life (similar to a Twitter feed).
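
As a rough illustration, the example values above could be stored as a nested mapping from person to context to trust. This is only a sketch; the `trust_in` helper and the choice of 0.0 as the default for unknown people or contexts are my assumptions.

```python
# Per-context trust values from the example, keyed by person and then context.
my_trust = {
    "TotalBiscuit": {
        "[Self]": 0.1, "Videogames": 0.9, "Wrestling": 0.0,
        "Bitcoin": 0.0, "Astronomy": 0.0,
    },
    "LukeJr": {
        "[Self]": 0.0, "Videogames": 0.0, "Wrestling": 0.0,
        "Bitcoin": 0.2, "Astronomy": -1.0,
    },
}


def trust_in(trust_table, person, context):
    """Look up trust for a person in a context; unknown entries count as 0.0."""
    return trust_table.get(person, {}).get(context, 0.0)


print(trust_in(my_trust, "TotalBiscuit", "Videogames"))
print(trust_in(my_trust, "LukeJr", "Astronomy"))
print(trust_in(my_trust, "SomeStranger", "Bitcoin"))
```

Negative values express explicit distrust, while 0.0 covers both indifference and people I have never rated.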

Evaluating content


By establishing our web of trust with the proper context, we can create proper filter bubbles for the content we want to consume. If the web of trust data were public, say, living in an Ethereum smart contract, we could pre-compute the values we would assign to everyone relevant to us on a given topic. People we directly trust would get their full score; whoever they trust would be counted as the product of the two trusts, further discounted by how far away they are from us. Repeating this for more distant connections, the scores would eventually tend towards zero trust - irrelevance.
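
The pre-computation could look something like the following Python sketch for a single context. The names, the per-hop discount of 0.5, the hop limit, keeping the best path when several exist, and not propagating through distrusted accounts are all assumptions of mine rather than part of any existing system.

```python
def propagate_trust(graph, me, hop_discount=0.5, max_hops=3):
    """Derive trust for everyone reachable from `me` within `max_hops`.

    `graph[a][b]` is a's direct trust in b for one context.
    """
    scores = {}
    frontier = {me: 1.0}
    for hop in range(max_hops):
        next_frontier = {}
        for person, carried in frontier.items():
            for peer, direct in graph.get(person, {}).items():
                if peer == me or direct <= 0:
                    continue  # assumption: distrust is never passed along
                # Direct trust counts in full; each further hop is discounted.
                value = carried * direct * (hop_discount if hop > 0 else 1.0)
                if value > scores.get(peer, 0.0):
                    scores[peer] = value  # keep the strongest path found so far
                    next_frontier[peer] = value
        frontier = next_frontier
    return scores


graph = {
    "me": {"alice": 0.8, "troll": -1.0},
    "alice": {"bob": 0.5},
    "bob": {"carol": 1.0},
    "troll": {"puppet": 1.0},
}
for person, score in propagate_trust(graph, "me").items():
    print(person, round(score, 3))
```

Each extra hop multiplies in another trust value and another discount, so scores shrink quickly with distance.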

Every submitted piece of content would have a context - just like hashtags on Twitter or subreddits on Reddit. Generic content would have a context of the person posting it - just like a Tweet without a hashtag. The content itself could be a post or a link like on Reddit.

Now, when someone consumes a piece of content, they would give it an upvote or a downvote, affecting how the people who trust them see that content. For every person, the aggregate score of a given piece of content would depend on everyone in their web of trust, and only that. So if I trust, say, 10 movie reviewers and they all upvote Cloud Atlas, it would have a score of 10 for me. If they disagreed on The Revenant, it might get a score of 2-3. If someone posts a picture of a cat in the "movie" category, they would all downvote it for a score of -10 - it might still be a cute picture, but it's not a movie.
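
Here is a small Python sketch of that personalized score. It assumes I trust each of the ten reviewers fully (1.0) in the movie context, that votes are +1 or -1, and that a disagreement means a 6-4 split; the function and variable names are hypothetical.

```python
def personal_score(votes, my_trust):
    """Sum votes (+1 or -1), each weighted by how much I trust the voter."""
    return sum(my_trust.get(voter, 0.0) * vote for voter, vote in votes.items())


reviewers = [f"reviewer{i}" for i in range(10)]
my_trust = {name: 1.0 for name in reviewers}

cloud_atlas = {name: +1 for name in reviewers}
revenant = {name: (+1 if i < 6 else -1) for i, name in enumerate(reviewers)}
cat_picture = {name: -1 for name in reviewers}
# A thousand downvotes from accounts outside my web of trust change nothing.
brigaded = {**cloud_atlas, **{f"puppet{i}": -1 for i in range(1000)}}

for title, votes in [("Cloud Atlas", cloud_atlas), ("The Revenant", revenant),
                     ("cat picture", cat_picture), ("brigaded", brigaded)]:
    print(title, personal_score(votes, my_trust))
```

The same submission can have a different score for every reader, since each reader brings their own trust values.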

Timeliness also needs to be taken into consideration. A lot of people would be interested in getting the latest news and the newest cat memes, so, just like on Reddit, the overall position of a submission should fade with time to leave room for fresh content.
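
One simple way to make a score fade is an exponential half-life, sketched below in Python. This is not Reddit's actual ranking formula; the 12-hour half-life and the function name are assumptions chosen only to show the effect.

```python
def ranked_score(score, age_hours, half_life_hours=12.0):
    """Halve a submission's personalized score every `half_life_hours`."""
    return score * 0.5 ** (age_hours / half_life_hours)


day_old_hit = ranked_score(10.0, age_hours=24.0)  # strong score, but a day old
fresh_post = ranked_score(4.0, age_hours=0.0)     # weaker score, just posted

# The fresh post outranks the older one once decay is applied.
print(fresh_post > day_old_hit)
```

A longer half-life favours proven content, a shorter one favours novelty.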

All of these things can be weighted depending on how one wants to browse the data feed. We could see the most relevant submissions, the newest ones, the most controversial, or the top submissions of all time.

Lastly, one's votes on various submissions might also slowly tweak one's own Web of Trust. Every upvote might add 0.01 to our trust in the submitter in the given context, and every downvote would subtract as much. This way we would organically adjust our filter bubble based on the content we consume, discovering new content curators as we go along.
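
A minimal Python sketch of that feedback loop follows. The 0.01 step comes from the text; clamping the result to the range -1.0 to 1.0 and the `apply_vote` name are my assumptions.

```python
def apply_vote(trust_table, author, context, vote, step=0.01):
    """Nudge my trust in `author` for `context` by `step` per vote (+1 or -1)."""
    contexts = trust_table.setdefault(author, {})
    updated = contexts.get(context, 0.0) + step * vote
    # Assumption: trust stays within [-1.0, 1.0].
    contexts[context] = max(-1.0, min(1.0, updated))
    return contexts[context]


my_trust = {"LukeJr": {"Bitcoin": 0.2}}
apply_vote(my_trust, "LukeJr", "Bitcoin", +1)       # upvote someone I already rate
apply_vote(my_trust, "newcomer", "Astronomy", +1)   # upvote a stranger's post
apply_vote(my_trust, "newcomer", "Astronomy", -1)   # later downvote cancels it out
print(my_trust)
```

A stranger whose posts I keep upvoting slowly becomes a trusted curator without my ever rating them explicitly.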

Technical aspects


From a technical perspective, things might be a bit complicated. The biggest challenge I see in deploying a system like this on a distributed network, with proper cryptography underpinning everything, is the overhead. While a simple system would just tally a score for every submission and update it now and then as new votes come in, here every vote would have to be registered and parsed separately. Seeing how Reddit can have ~30 million votes per month (over 10 votes per second) and 230 million unique visitors, that can mean a lot of data to synchronize. A lot of it could be partitioned by context - if you're not interested in Bitcoin, you don't have to synchronize the Bitcoin sub-branch, etc.
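
For a sense of scale, the Python sketch below redoes the votes-per-second arithmetic and shows the idea of only synchronizing votes from contexts one follows. The 30-day month, the vote record layout and the function names are assumptions for illustration.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumption: a 30-day month


def votes_per_second(votes_per_month):
    """Convert a monthly vote count into an average per-second rate."""
    return votes_per_month / SECONDS_PER_MONTH


def votes_to_sync(votes, subscribed_contexts):
    """Keep only the votes cast in contexts this user follows."""
    return [vote for vote in votes if vote["context"] in subscribed_contexts]


print(round(votes_per_second(30_000_000), 1))

incoming = [
    {"context": "Bitcoin", "voter": "alice", "value": +1},
    {"context": "Videogames", "voter": "bob", "value": -1},
    {"context": "Astronomy", "voter": "carol", "value": +1},
]
print(votes_to_sync(incoming, {"Videogames", "Astronomy"}))
```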

Implementing something like segregated witness could probably separate out a lot of data that could later be pruned, making the overhead a lot smaller.

One would also need to address the issue of bootstrapping new users onto the system - if only people whom someone already trusts had their content viewed, newcomers would go unseen and a lot less content might be posted in general. This could be solved, for example, by proof-of-burn - anyone joining the website could burn some small amount of money (or donate it to the network creators) to gain a bit of reputation from a generic account that everyone would trust by default. This would allow new users to start growing trust in themselves.
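
The bootstrapping step might be sketched in Python as follows. Everything here is hypothetical: the conversion rate of 0.01 trust per unit burned, the cap of 0.1 and the `register_burn` name are placeholders, and verifying the burn itself is left out entirely.

```python
def register_burn(root_trust, user, burned, trust_per_unit=0.01, cap=0.1):
    """Record the trust the generic root account extends to `user` after a burn.

    Assumption: trust grows linearly with the amount burned, up to `cap`,
    so that reputation cannot simply be bought in bulk.
    """
    granted = min(cap, burned * trust_per_unit)
    root_trust[user] = max(root_trust.get(user, 0.0), granted)
    return root_trust[user]


root_trust = {}  # trust held by the generic account everyone trusts by default
print(register_burn(root_trust, "newcomer", burned=5.0))
print(register_burn(root_trust, "whale", burned=1000.0))
```

Since everyone trusts the generic account, even this small grant makes a newcomer's posts faintly visible, and real trust has to be earned from there.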

All in all, the concept might look a bit like Synereo, although a bit less focused on absolute reputation scores and tokenization.


Conclusions


Current social media can be heavily controlled and censored by a few individuals. Discourse on Reddit is further hampered by vote manipulation and spam. It might be possible to change how we discover content through the use of a public Web of Trust and a decentralized network to submit the content through.