Long Live the Semantic Web

I had a conversation this week that reminded me of my old days of thinking that the Semantic Web was going to make the world a much better place. Well, I can’t say that I was right there. Before I go further on why, let me step back and explain what I thought the Semantic Web was going to accomplish:

One of the problems of the internet is that you build these linked silos (sometimes not even linkable) of information and activity. Let’s say that you want to go to a coffee shop that is within walking distance, has good reviews, and that a friend of yours has recommended. Today the only way you can do it is to hope that there is a service out there that has merged all those features and that is nice enough to allow you to filter things that way. But there are services out there that allow you to know what is within walking distance (for your definition of how long you are willing to walk), there are reviews, there are communications with friends. It’s just that they are not connected in any way.

What the Semantic Web was going to do was give you a “meta-internet” that allows people to build features on top of other people’s data and activities, tailored to specific goals. The data would stay with its “owner” (e.g. it wouldn’t require somebody to crawl and keep a copy of all the data); the meta-internet would just abstract the presentation of the data to a specialized system. And that system then becomes the source for even more specialized systems.
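As a rough illustration of the coffee shop example, here is a minimal Python sketch. The three “services” are stand-in dictionaries and sets with made-up shop names, not real APIs, and the function name is hypothetical. The point is only that each source keeps its own data and the answer comes from a query across all of them.

```python
# Hypothetical, independent data sources. In the scenario described above,
# each would be a separate service that keeps ownership of its own data.
walking_minutes = {"Shop A": 6, "Shop B": 25, "Shop C": 9}     # a maps service
review_scores = {"Shop A": 4.6, "Shop B": 4.8, "Shop C": 3.1}  # a review service
friend_recommendations = {"Shop A", "Shop B"}                  # a social service


def find_coffee_shops(max_minutes, min_score):
    """Return shops that satisfy all three criteria, sorted by name."""
    return sorted(
        shop
        for shop, minutes in walking_minutes.items()
        if minutes <= max_minutes
        and review_scores.get(shop, 0.0) >= min_score
        and shop in friend_recommendations
    )


print(find_coffee_shops(max_minutes=10, min_score=4.0))  # ['Shop A']
```

In the real world each of those lookups would be a remote call to somebody else’s system, which is exactly where the hard problems start.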

Sounds cool? Well, I thought it did (and actually part of me still thinks it’s cool). But it didn’t work (there are still semantic web researchers out there, so maybe some people think that it just hasn’t worked yet and may still work one day). Why didn’t it work? There are many really hard problems:

  1. There are many technical challenges in doing these distributed dynamic queries efficiently, considering that you don’t own pieces of it, so you have to constantly deal with potential availability and latency issues.
  2. Representing data in a way that is expressive and reusable is very hard. In my days working on the Amazon catalog, I learned the cost of getting high quality domain-focused data at scale. Imagine getting data that is “complete” so that it can be correctly connected to other things… So expensive that companies would need to have a very strong financial motivation to do it. Without owning the experience and guaranteeing stickiness, ad impressions, and constant feedback on how you can improve things, it’s very hard to show that financial motivation.
  3. Building the higher level applications also isn’t easy. The logic-based language that is used to represent knowledge and be able to navigate through it isn’t for the faint of heart.
  4. There are lots of security considerations around interacting through data.

That’s it. Can’t be done? Well, maybe there could be some partial light at the end of the tunnel. The key piece that is changing is that we now have much higher level “programming languages” that might be able to at least solve for (3) by moving away from this complex logic-based language into more of a natural language approach. Behind a lot of what voice assistants do today is technology that was researched by the Semantic Web people.

Another thing that has continuously evolved is making data available through APIs. It’s not the same as (1) where everybody is talking the “same language” and you can connect concepts almost “out of the box” by just adding configuration. For every API that you integrate, you need to write code specifically for it. But back to the higher level programming languages idea, maybe it’s becoming pretty easy to consume those APIs more consistently, so we can build that joint knowledge.

Now we are left with financial incentives for companies to open up their data and security considerations. On the security side there are some pieces of support for it, but making that efficient and having some transitive permission model is still an open question (which points towards things like OpenID, which didn’t get much traction either). On the financial incentives, well… That’s not my area of expertise. If it weren’t so expensive to do, I’d harbor some hope that a Wikipedia-like solution would eventually happen (like Wikidata), but I’m not so sure it can.

Maybe it will be built internally at a company to power their voice assistant and search, and then the US government will come along and force the company to break apart, and the only way it can keep powering both with the same source will be to open it up for everybody to use. Yeah, wouldn’t that be cool?

The right tool for research

Sometimes you are looking for something on the internet that is too specific for generic tools like “Google” and “Bing” to actually solve. I was given one of those the other day: “List all US astronauts not born in Ohio”. There are four options for tackling questions like this:

  1. You keep trying multiple “general” searches trying to find a page that contains your answer by looking for specific words that might be on the list. Things like “Ohio astronaut Akers Altman” and find Wikipedia’s list of astronauts by state. Going directly to Wikipedia might also have worked, once you know what to look for.
  2. Go and search for specialized sources, like “astronaut history”, which might take you to NASA’s Astronaut Bios, where you can download a factsheet for all astronauts and build your list.
  3. Know your sources and look for specialized structured search sites, like FreeBase (a Google property now). And then build your search… Profession name is Astronaut AND Place of birth is contained by name is not Ohio AND Place of birth is contained by name is United States of America. Voila!
  4. Give up
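The structured search in option 3 boils down to a filter over typed records. Here is a minimal Python sketch of that idea. It is not FreeBase’s actual query language or API: the `Person` type, the field names and the tiny hand-typed dataset are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    name: str
    profession: str
    birth_state: str
    birth_country: str


USA = "United States of America"

# A tiny hand-typed sample, standing in for a real structured data source.
people = [
    Person("Neil Armstrong", "Astronaut", "Ohio", USA),
    Person("John Glenn", "Astronaut", "Ohio", USA),
    Person("Buzz Aldrin", "Astronaut", "New Jersey", USA),
    Person("Sally Ride", "Astronaut", "California", USA),
    Person("Thomas Edison", "Inventor", "Ohio", USA),
]


def us_astronauts_not_born_in(state, records):
    """Profession is Astronaut AND birthplace is in the USA AND not in `state`."""
    return [
        p.name
        for p in records
        if p.profession == "Astronaut"
        and p.birth_country == USA
        and p.birth_state != state
    ]


print(us_astronauts_not_born_in("Ohio", people))  # ['Buzz Aldrin', 'Sally Ride']
```

The filter itself is trivial. The hard part is the “contained by” relationship, which in a real source means walking from a city up to its state and country.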

The question on #3 is how precise is the answer. I’ll say that it’s ok… There are some US astronauts that don’t appear when you search for “Place of birth is contained by name is USA”. And the list of astronauts does contain some people that actually never made it to space, but it’s the closest “precise” source that I have found. Wolfram Alpha doesn’t have this information, nor does Factual or TrueKnowledge (which is overrun by ads right now – very sad).

The other question on #3 is how easy it was to build the query. Because I had played with (and contributed to) FreeBase before, it wasn’t too hard. The “Place of birth is contained by” part was the hardest, but there weren’t too many other options. It was also a little slow, which made it a little frustrating. So thinking about people without a lot of background in data modeling and structured search, I don’t think they would be able to use it. So it’s not really there yet.

But it was fun!

2011 Gadgets – the year of Apple…

After Apple’s ridiculous quarterly earnings, the summary provided by gdgt of the most used gadgets of 2011 should come as no surprise:

gdgt Zeitgeist 2011

Where Apple shows impressive numbers everywhere:

  • 5/10 top gadgets launched in 2011 were Apple’s
  • 5/10 top gadgets in 2011 (launched not necessarily in 2011) were Apple’s
  • 2 of the top 3 mobile and desktop platforms are Apple’s
  • 6/10 top Christmas gadgets
  • Finally, 8/10 trashed gadgets

Yes, gdgt’s numbers are biased towards more tech-inclined higher-earning people, so they will be more Apple-biased than the average population, but it’s still quite impressive. Let’s see how long they will be able to continue on this roll.

Privacy-supported music?

I was reading this article today about a new record label, DigSin, that will be offering their music for free on their website.

Why Record Label DigSin is Giving Away its Music – from TheNextWeb

How are they paying for their music? By getting more information about who is listening to it! It’s interesting how quickly people are realizing the cost of privacy, and the economic benefit that you can get if you are able to break that privacy barrier.

Today, as far as I know, Spotify and iTunes provide analytics to the record label about how many people purchased/listened to their music, and from where (I think that Spotify also provides time information). But owning your own analytics is always better if your business is around user analytics (analytics for this blog is great to have, but as I don’t get any money from it, whatever the platform, Squarespace, provides is better than what I would have time and knowledge to generate myself). Also, they will probably request email addresses for those users, which is much more powerful than anything that larger services can provide.

Where is this trend taking us? A lot more people knowing things about us. Is this a good thing? For people that believe that relevant ads are good for you, yes! For me: it depends on how and when it’s used. If I can then go to their website and say: “Hey, I’m looking for something to do next month. What is something I’d be interested in?” and they would be able to provide me with artists that will be having concerts around here that I’ll like based on my past downloads, I’ll gladly provide them with my information (assuming that they have their security in place and I won’t start getting spam in a few months because somebody accessed their database).

Anyway, I could go a long time writing about this topic. I’ve had the opportunity in my past to work on a few “personalization” projects and had to think about my “limits” of what I think is good for people and when we cross that line. Privacy is good. But recommendations (algorithm or human-based) are what allow us to choose in this world full of choices.

Windows Phone 7 Voice and Siri

I watched this very short interview with Craig Mundie for Forbes:

Been There, Done That

And it cracked me up! In the second part of the interview, he says that everybody is overhyping Apple’s Siri, because Microsoft (and, if one should follow his line of thought, Google) has had voice commands on the phone for over a year now. The part that he doesn’t understand is that there is a fundamental difference between what Windows Phone Voice does and what Siri does:

Windows Phone provides you with voice shortcuts to doing things. You can say “Call …” and it will make a call. “Search …” and it will use Bing to search. So, if you know the right phrases, you will get exactly what you ask for.

Siri, on the other hand, tries to take this voice recognition to a “conversation” level. Yes, you can do things like “Call 425-555-1000” and it will also call that number. However, it can do things like “Call my wife” and the first time you do that it will ask you to tell the name of your wife. After that it will know and will call that number all the time.

Does that sound the same? Well, now for a different example: I can ask Siri: “Will it rain today?” and it will reply with things like: “There is no rain on today’s forecast” and show me the forecast. Again, why does Microsoft think that they have a similar product already? It’s amazing and very sad how blind some companies can be from time to time.

RSS Feeds still alive?

Tonight I was in a conversation with a group of not-so-techie friends and suddenly one of them proclaimed her admiration for RSS feeds! That was quite shocking to me, as multiple people out there have been proclaiming RSS a “dead medium” for some time now (e.g. [1], [2]), and saying that things like Twitter, Facebook and Google+ are better ways of navigating and sharing articles.

There are multiple things against RSS:

1) It’s focused on recency and not on relevance. That makes it very hard to use for things that generate too many articles or many updates to articles, such as some news agencies.

2) It keeps people away from ads, which is what allows those companies to post the articles to begin with. So, in order to make you go to their website, they end up posting only a small part of the article and hope you will click through and see the full article and the full set of ads. Some sites do it almost right, such as Engadget. But some are just unusable, such as Estadao.

3) You lose all formatting and “metadata” that enriches your experience trying to read articles. For example, if you go to the NY Times website and read an article there, you will see that at the end of the article it will provide you with recommendations of related articles. Also you don’t see comments and you can’t easily enrich the article with discussions.

4) It’s actually not very easy to use. Some websites have multiple RSS feeds to choose from, and many RSS readers make adding new feeds very clunky.

So seeing that there are people out there who are not deeply into tech and who use RSS religiously and happily is quite strange. Even Google, which has one of the most popular Web-based RSS readers (Google Reader), is hinting that it is moving away from it and focusing more on Google+, which gives you the possibility of commenting, bubbles the most “interesting” articles (as defined by your circles) to the top, and in the end just forwards you to the content provider’s website, where you can see all the ads and features that they provide with the article.

What about me? Well, I still use Google Reader. From time to time I do go for a week or two without opening it, but it’s not that I find a different source for my news – it’s just that I don’t have time to read anything. I did try to use Twitter for it, but I just don’t follow enough people, and don’t spend enough time tracking links (the problem with Twitter is the firehose effect – you either keep constantly up-to-date with what is going on, or you miss a lot of interesting things).

Facebook never really worked for me. Yes, I do have quite a few good friends there that share interesting articles, but most of the time all I see is my other friends talking about their lives. It’s interesting, but doesn’t really give me enough news coverage. And Google+? Well, I just haven’t invested any time there at all. I still have way too few friends there, and even fewer that post anything.

So, do I think RSS is going to die? Yes! I just don’t yet know what is going to replace it.

Printable memory – more than a million entries?

I was looking through my past emails today and I came across a new product sold by Inventables:

Printed Rewritable Memory Development Kit

I first thought the focus was going to be on the “printed” part, but actually it’s just around being a flexible memory storage component. So I was looking a little further to know how much it can store and I read:

Each Thinfilm Memory™ sticker costs about 5-cents apiece in volume and each sticker contains 20 bits of data, corresponding to a lookup table that can store more than a million entries.

So it’s a 20-bit memory. That by itself sounds very small (you can’t even store 3 ASCII characters). So they call the brilliant marketing people and come up with something that might make it sound usable. Actually what they really are trying to remind people is that:

2^20 = 1,048,576

That’s big, right? I guess it all depends on what you are going to use it for. If it’s just to store some sort of multiple-choice configuration (should the toy say “hi” or “oi” when connected to the base?), then it’s probably good enough. However, if you want to store things like a serial number so that you can track something, then you might end up hitting the limit at some point.
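For the curious, the arithmetic behind the “more than a million entries” claim is easy to check. The following is just a quick Python sketch: the 20-bit figure comes from the product description quoted above, while the 7-bit ASCII width and the helper name `bits_needed` are my own choices for illustration.

```python
BITS_PER_STICKER = 20  # figure from the product description quoted above

# Number of distinct values (lookup-table entries) that 20 bits can address.
distinct_values = 2 ** BITS_PER_STICKER
print(distinct_values)  # 1048576

# Three 7-bit ASCII characters would need 21 bits, one more than we have.
ASCII_BITS = 7
print(3 * ASCII_BITS > BITS_PER_STICKER)  # True


def bits_needed(count):
    """Smallest number of bits that gives `count` items distinct IDs."""
    return max(1, (count - 1).bit_length())


# One serial number past the limit and a sticker no longer fits it.
print(bits_needed(distinct_values))      # 20
print(bits_needed(distinct_values + 1))  # 21
```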

In any case, it’s still an interesting technology. It requires contact with a reader, so that limits its usability somewhat, but I can see it being usable for things like toys and art pieces (think of something where you can control the color or frequency by switching physical objects on well-defined connectors – the memory is used to identify the object that has been connected).

New stuff

So here I am again to talk about a lot of random topics that I’ve completely kept to myself for the last couple of months. Busy months, but not in the same way that most of the rest of my year was busy. Yes, I have still been working more than the average person in the US, but it hasn’t been nearly as crazy. Work changed to being much more focused on designing the “next thing” than actually getting something out of the door (the Kindle Touch). But I won’t talk too much more about it, just that I didn’t really have anything to do with the “famous” Kindle Fire, even though I did buy one and have one at work for development. I’ll talk about it later.

More things that happened: my first niece, Sophia, was born and I went to Sydney, Australia to visit her (and my sister and brother-in-law, just because I might as well say hello to them if I’m there). It’s strange to hear my parents calling themselves grandfather and grandmother, but besides that it was great. Australia is a country worth visiting! After spending a week in Sydney, we went to Tasmania for 3 days and then Melbourne for another 3 days. All amazing places! I’ll post links to the pictures when I finally finish selecting a reasonable number of them.

After I came back, the weather started turning colder and leaves started making a mess on the ground everywhere. But that didn’t really make me stay home more often, as we are starting to plan home remodel phase II and there are always many things to do during the weekend. I did prepare for staying home, though, by buying more toys:

iPad – Well, technically I bought this just before our trip to Australia so that I would be able to travel without taking my laptop, and it worked alright. It’s just that some hotel WiFi authentication systems don’t work very well with the iPad (or iPhone). Nothing really interesting to mention here. It’s an iPad and it’s a very solid piece of technology. There are lots of applications for it, even if there are fewer than for the iPhone (and it’s very annoying to use iPhone-built software on the iPad).

iPhone 4S – I really considered jumping ship to Android when I was switching my phone, but I wasn’t very happy with any of the Android phones out there. Too many mixed reviews with people complaining about battery life, lack of stability, weird update behavior, etc. And I’m so used to all the applications that I had on my iPhone 3GS that I decided to be “safe” and continue on the platform at least for another 2-year cycle. Anyway, it’s not a huge change from my 3GS. Siri is pretty good, which makes me use it quite often. I’m not as bullish about it as many tech analysts out there, but I can see how it is the closest thing that I’ve seen to a really usable speech-based UI so far. It’s certainly not something added to the phone just to catch up with Android (which has voice input, but not really a system that can answer natural language voice questions, like “is it going to snow today?”). It’s always great to see technology moving forward!

Skyrim – Probably the most anticipated game of the year among the friends around me. Yes, there were tons of sequels this year, like Portal 2, Uncharted 3 and Gears of War 3, but out of those sequels that I’ve played, Skyrim is hands-down the best of them. Just like any Elder Scrolls game, it’s not a very quick game (I know people that have already clocked something like 80 hours on it, considering that it has only been out for 9 days now), but it has the level of depth, intrigue and oddness that keeps you excited most of the time that you are playing the game (not the time that you are lost in the dark trying to find your way to your next destination on the other side of the map).

Kindle Fire – as I’ve mentioned, I did buy one. At $200 it was difficult not to allow my curiosity about owning an Android device to win (even if it’s nothing like your typical Android device). What are my impressions, you might ask… Well, it’s not a bad device. There are quite a few things that work well, like the cloud player integration, the Kindle books, and the Amazon Appstore. However, it’s also a little finicky at times. Some applications crash from time to time (although I have heard things like that about any Android device), and only having soft buttons (including volume control) is a little weird in some applications. All in all, it’s certainly a good buy, but I’ll be more excited after a couple of updates (and I’m not talking about new functionality that I may or may not know is coming in future updates, more about bug fixes and clean up of some weird UI oddities).

And that’s it. As you can see, I can keep myself busy for a long time! And this doesn’t even talk about pre-existing projects.

Angry at Flash updates

It’s easy to be annoyed with the number of software updates we receive lately. Especially when you have many of them, like on my iPhone (and it’s not that I have that many things on my iPhone, but I always seem to have at least 5 updates a week for my programs). Or when it’s a Windows update and Microsoft decides that you should take it and they should restart your computer without asking (making me lose work that was running on the computer overnight).

Today, however, I want to complain about Adobe Flash updates. And I’ll focus on two specific points on them: the amount and the information on the updates.

Amount: they seem to happen all the time! And, for some reason, my Windows machine at work has both Flash 9 and Flash 10 installed, so I get two updates at a time!

Information: if you are asking me to update something, please tell me what I’m updating. I think that the Windows update provides some information, but the Mac one says:

But then when you try to “See details…” you get to a page that contains the release notes for all versions, and not just for the one I’m trying to update. Thanks, Adobe!

Spotify and Classical Music

There has been a lot of discussion for some time about streaming music services like Spotify, MOG, Rhapsody, Grooveshark, etc. The discussions revolve around whether it’s possible to support the music industry by paying fees like $0.30 per track listened, or even much less (source).

Things become a little bit more complicated when you get to classical music. In popular music, tracks are much more comparable: an album with 12 tracks usually takes 12 times more work to produce than a single with one track. So paying by the track is not too bad. However, when you get to classical music, it’s hard to compare the cost to produce a single track for something like Morton Feldman’s Piano and String Quartet, which has one single “movement” that is 65-77 minutes long (depending on the rendition). On the other hand, you get things like operas that have tracks that are just a 30-second recitative between movements, or, if you want to go to the extreme, I have to mention John Cage’s 4’33” (and yes, there are recordings of it).
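To put rough numbers on that, here is a small Python sketch of what a flat per-track fee means per minute of music. The fee is simply the figure quoted above, used as an illustrative parameter, and the function name is made up. The durations are the ones mentioned in this post, taking the shorter 65-minute rendition of the Feldman.

```python
PER_TRACK_FEE = 0.30  # dollars per play; the figure quoted above, illustrative only

# Track lengths in seconds, taken from the examples in the text.
track_seconds = {
    "Feldman: Piano and String Quartet": 65 * 60,
    "30-second opera recitative": 30,
    "Cage: 4:33": 4 * 60 + 33,
}


def payout_per_minute(seconds, fee=PER_TRACK_FEE):
    """Flat per-track fee spread over the length of the track."""
    return fee / (seconds / 60)


for title, seconds in track_seconds.items():
    print(f"{title}: ${payout_per_minute(seconds):.4f} per minute")

# How much better the recitative pays, per minute, than the Feldman piece.
ratio = payout_per_minute(30) / payout_per_minute(65 * 60)
print(f"ratio: {ratio:.0f}x")
```

Whatever the actual fee is, the ratio stays the same: with a flat per-track payment, the recitative earns about 130 times more per minute than the 65-minute piece.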

So maybe you can claim that on average it works out, but is average enough? Will this just create incentives for all music to be based on many short tracks, making it less than it is today? Will we see a quality reduction similar to the one generated by the famous “loudness war” that happened with CDs? Spotify doesn’t think so, according to their blog post:

Why Classical Music Needs Spotify

But I guess it’s their job to believe that music streaming services are here to actually save the music industry from the doom that digital music is causing it.

My opinion is actually that music can never be treated as “one solution fits all” when it comes to how you consume it, how you interact with it, and how you pay for it. The more the simplifying effect of mass marketing of technologies reaches all our experiences, the less quality we will end up getting from our diverse experience.

Whatever happens, no artistic expression will ever disappear. And I believe in technology. I think that we will probably go through a period in which there will be a drop in the quality of what we get exposed to, but I think that’s just necessary to get the technology to stabilize and people to understand the actual economy of it. Then there should be another expansion, which will bring back quality and diversity at the right price.

It’s just like when music recording started. At first you had just live performance, and the sound quality was great! Then came the first recordings, and everything was convenient but had terrible sound quality. Slowly, sound quality improved with new technologies to the point where, at the peak of vinyl, you could get amazing depth of sound if you were willing to pay for the equipment.

Then we started a new wave of quality reduction in the name of convenience. We’ll have to ride this wave, and on the other side of it I believe we will find a world where we can enjoy more of the things we like.