Amy Guy

Raw Blog

Sunday, July 28, 2013

Week(s) in review: #SSSW2013, figuring stuff out and annotating YouTube

8th - 14th July

Semantic Web Summer School, much heat, much fun, much learning... Here's an index of my posts.

15th - 21st July

Friends visited.  Progress included writing notes to myself to figure out just what my PhD outcomes really are, and why.  Came up with:

1. Recommending how to usefully describe diverse amateur creative digital content (ACDC) using an ontology.
    a) What are the parts of ACDC that need to be represented?  Identify and categorise properties. How do these differentiate it from other similar content?
    b) What existing ontologies can be used to do this, and how do they need to be extended?
   
2. Building an initial set of linked data about ACDC, and providing means for its growth and use.
    a) Manual annotation of ACDC, and refinement (to test ontology).
    b) Tools for automatic annotation of the parts of ACDC that it is possible to automatically annotate.
    c) Tools for manual annotation by the community of content creators and consumers for the parts of ACDC that cannot be automatically annotated.
    d) Tools to expose the linked data for use by third-party applications.

3. Create and test an example service which uses the linked data to benefit content creators and/or consumers.
     eg. unobtrusive recommendations for collaborative partners (most likely); content recommendation; content consumption analysis (like tracking viral content); community building / knowledge sharing in this domain; ...

22nd - 28th July

Brainstormed with Ewan about stage 3 (above), and came up with the idea of an interface that allows content creators to allocate varying degrees of credit for the roles played by different people when collaborating on a project.  This would gather collaborative bibliographic data, reveal how different segments of the community allocate credit, and provide a potentially useful tool for content creators.  With the future value that, if we can learn enough to estimate role inputs from different people, it could be used for things like automatic revenue sharing.

Then spent the rest of the week in London, frolicking amongst the YouTubers (including attending a meeting at Google about secret YouTube-y stuff), and annotated some ACDC.  Write-up coming soon.

Thursday, July 11, 2013

#SSSW2013: Social semantics and serendipity

We started work on the serendipity project before breakfast today, although I didn't make it down as early as some of my teammates.

To start the day, Fabio Ciravegna talked about some really exciting practical applications of monitoring and analysing social media streams.  It's particularly interesting during emergencies, or large events where problems might occur.  The people on the ground make the perfect sensors if you can work out the difference between people who are saying something useful and those who aren't; people who are really there, and people who are speculating or asking about the situation.  A main problem has been that people tweet crap.  They were trying to monitor a house fire, but so many people were tweeting lyrics from Adele's various singles (which all apparently contain references to fire) that it was almost impossible.

They also put (or tapped into existing) sensors in people's cars to monitor driving patterns with the aim of more fairly charging for car insurance.  I told my Mum about this the other day, and she was pretty alarmed by the idea.  Which made me wonder how they'll get mass adoption, if it's going to go anywhere.

Fabio did have some interesting things to say about using all this data ethically though, and never working for someone who is going to take that away from you.  But in case the 'bad guys' do find out about all this data you have about people, keep a magnet handy.

My notes are here.

This was followed by a hands-on session where we got to mess with a mini version of the twitter topic monitoring system that Fabio's team use at large events, to try to answer questions about the Tour de France only by manipulating the incoming social media streams and following only links which came through that.

Spanish omelette sandwiches were an amazing outdoor leisurely lunch.  We headed to the pool down the road and chilled out there for a couple of hours.  Us tough British folk found the water pleasantly tepid, whilst all those wimpy Europeans and Latin Americans shivered on the grass.  They'd made such a fuss in advance about how cold the pool was going to be.

We regrouped that afternoon to work on Project Cusack, creating a slide deck of pictures from Serendipity.  I don't like slides with too much to read on them, so I enforced this.  The imagery from the movie will be lost on most people, but we have at least managed to choose pictures of John Cusack with appropriate expressions for each part of the presentation.  We worked outside in the forest, because Oscar's 3G was faster than the residence wifi.



We also brainstormed for the required short film, which we only just discovered doesn't have to be about our project.

We returned to the residence to find everyone eating ham and cheese, and attempted to get some shots for our film, but other people were unwilling to participate.

That evening we ate tasty vegetable soup, weird (in a bad way) pasta in a creamy onion sauce, and chocolatey ice cream cake.  The tutors spontaneously organised a game where students had to arrange the tutors by age, which was funny.  Someone suggested the tutors ought to play it with the students.  Obviously there were too many students, but they elected to find the youngest student, and that turned out to be me.

[Notes] Fabio Ciravegna at #SSSW2013


Make a model of what is happening.

WeSenseIt - citizen water observations.

River belongs to citizens, not authorities.
Physical sensors (hard layer) are expensive and brittle.
So use people instead (soft layer, social).

Give people small sensors.  Phones.
Then you just need software for information management.

  • capture.
  • integrate and correlate data.
  • share.

Can't rely on phones.
Old people in Doncaster.

Give them easy sensors instead.

  • camera
  • humidity
  • position GPS
  • water depth, velocity
  • rainfall via accelerometer
  • cloud coverage via luminosity

Costs about EUR 80.

Open Source & hackable.

Not expected to substitute professional sensors, but a way to crowdsource information you would never get.


In Delft

Give people flood preparation advice and record who ticks things off, to build a picture of who/how/when preparations take place.


The Floow Ltd

"Commercialises data solution for telematic insurance."

World divided into 10x10m squares, sense things everywhere.
Traffic risks.
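As a rough sketch of what bucketing the world into ~10x10m squares might look like (the projection and cell scheme here are my own guess for illustration, not The Floow's actual method):

```python
import math

def grid_cell(lat, lon, cell_m=10):
    """Bucket a GPS fix into a ~cell_m x cell_m grid square.
    Equirectangular approximation: fine over small areas, and
    entirely my own sketch of the idea."""
    m_per_deg_lat = 111_320                       # metres per degree of latitude
    m_per_deg_lon = 111_320 * math.cos(math.radians(lat))
    return (int(lat * m_per_deg_lat // cell_m),
            int(lon * m_per_deg_lon // cell_m))

# Fixes a few metres apart share a cell; ~100m apart they don't.
```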


Sensors tell you people are going somewhere, not why.
That's what social media can tell you.



Monitoring development of a house fire via Twitter.
Seeing events through the eyes of the community.

Social streams:

  • High volume
  • Duplicated, incomplete, imprecise, incorrect
  • Time sensitive / short term
  • Informal
  • Only 140 characters
  • Spam

Large music festival.  Monitor geolocated messages, trends, topics and relations.

Most 'critical' events were management issues.
Developing system to warn you automatically about things to pay attention to.

Look/listen for event within 72 hours.  10 minutes to find out what it was.
- Simulation of station bombing.
Minute by minute description of event.
1.5 billion messages.

  • Linguistic issues
    • Alternative language
    • Negatives
    • Conditional statements
    • Hope/prayer statements
    • Irony/sarcasm
    • Ambiguity
    • Unreliable capitalisation
    • Data sparsity

Four things when monitoring:

  • What
    • Identify, classify, cluster
      • Events and sub-events
      • Involved entities
  • Who
    • Human or not?
    • Bots can be benign, but many are a serious risk.
    • Bots that pretend to be humans.
  • When
  • Where


Big problem - people tweet crap!
People don't realise when people nearby are in danger.


Deception on social media

False crowdsourcing political support on social networks.
Smear campaigns using bots.
Bots to foster / prevent social unrest.


Identifying bots

23 behavioural features.
Feature set is open.
Recognise 90% of bots - more than humans can do.
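The actual 23 features aren't in my notes, but the feature-scoring idea might look something like this (feature names and thresholds entirely invented):

```python
def bot_score(account):
    """Toy behavioural-feature scorer: fraction of bot-like signals.
    The features and thresholds here are made up for illustration,
    not from the real (open) 23-feature set."""
    features = [
        account["tweets_per_hour"] > 20,              # inhuman posting rate
        account["follower_following_ratio"] < 0.01,   # follows far more than followed
        account["duplicate_tweet_fraction"] > 0.5,    # mostly repeated content
        account["mean_seconds_between_tweets"] < 10,  # machine-like regularity
    ]
    return sum(features) / len(features)

suspect = {"tweets_per_hour": 50, "follower_following_ratio": 0.001,
           "duplicate_tweet_fraction": 0.8, "mean_seconds_between_tweets": 5}
print(bot_score(suspect))  # 1.0
```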



Only a very small proportion of tweets are geolocated, so that's useless.
Have to use the text.

Timestamp is not necessarily correct.


Issues in events

No infrastructure (eg. at music festivals).
Phone signal issues, phone charging issues.

Most tweets from outside event.

Conclusions

Need to convince citizens that authorities are not spying on them.
Need to convince authorities that citizens are not all criminals.

Privacy and legality issues.

Creating a company on this research would be unethical.
Need to pass the right message.  Full disclosure.  Non-intrusive use of tweet content.

What happens when authorities demand this technology for privacy-invading stuff.

Have to be careful with what you publish.
Always assume the bad guys have thought of what you thought of.
Always be in a situation where you can destroy your data at short notice.
Big legal barrage behind them.  Know what they are/aren't allowed to do, and what they do/don't have to do.
Start leading a blameless life.

Wednesday, July 10, 2013

#SSSW2013: Practical semantics and human nature

Harith Alani talked about using semantics to solve problems around evaluating the success of social media use in business.  The SIOC ontology is widely used to describe online community information.  It's not as simple as measuring someone's engagement with a brand's online presence - people are 'likeaholics' on Facebook, so you have to look at someone's whole behaviour profile to judge whether their like means anything or not.  It's no good just aggregating your data and spewing out numbers - you have to browse the data and try to understand where it came from.

He mentioned how little work has been done in classifying community types.  Most of the work that has been done seems to be with social networks internal to an organisation.  A bottom-up approach to community analysis can handle emergent behaviours and cope with role changes over time.  Looking at behaviour categories and roles can help an organisation to decide who to concentrate on supporting and how in order to sustain the community.  The results they have seen so far suggest that a stable mix of the different types of behaviours are needed to increase activities in forums - but they don't know what causes what.  They're reaching a point where they can use their behaviour analysis to guess what's going to happen to a community: how long it will last, how fast it will grow, how many replies a certain type of post is likely to get, etc.

Next they want to be able to classify community types, and be able to look at activities within a community over a period of time and automatically discover what kind of community it is; it might be something different than what it was set up for.

They created an alternative Maslow's Hierarchy of Needs to correspond with activities seen on forums, and found that most people are happy to stay at the lower levels of the hierarchy.  For example, join a community, lurk for a bit, ask one question and leave.  Not everyone wants or needs to be a power user.

Papers are being written that find patterns in individual datasets for a particular community in a particular context.  Harith and his team are getting tired of this; they want to generalise across communities.  So they took seven datasets and looked at how the analysis features differed as well as comparing the results across community types, randomness (vs. topicality) of datasets, and compared similar experiments.

Upcoming work includes the Reel Lives project, in which UoE is involved.  They're taking media fragments - photos, videos, audio clips, text recorded as audio - and creating automated compilations to tell a story.

Another is social methods to change energy consumption behaviour.  LiSC in Lincoln did something in this area back in the day: an app that posted on your Facebook feed that you were listening to an embarrassing song if you left your lights on.

Notes from Harith's talk are here.



From Tommaso Di Noia's talk, I learnt that recommender systems have a lot of maths behind them, especially for evaluating things, and reinforced something I already knew: I don't maths good enough to be taken seriously by most of the Informatics world.  I think I understand the principles behind the maths, but when something is described in just maths, I have no idea what it relates to.  I'll work on this.

Real world recommender systems use a variety of approaches, including collaborative (based on similar users' profiles); knowledge-based (domain knowledge, no user history); item-based (similarities between items); content-based (combination of item descriptions and profile of user interests).  Linked Open Data is used to mitigate a lack of information about entities, and helps with recommending across multiple domains.  You do have to filter the LD you use before feeding it to your recommender system though, to avoid noise.  Notes here.

Tommaso's talk was followed up by a hands-on session, where we got to poke about with some of the tools he mentioned, including FRED (transforms natural language to RDF/OWL); Tipalo (gets entity types from natural language text); and using DBpedia to feed a recommender system.

Then we worked on our mini-projects for the afternoon.  We made some progress towards breaking down the concept of serendipity and working out what properties we might need to represent as linked data, and how we could observe a user and work out if/when/how they were having serendipitous experiences without intruding too much.

In the evening we took a coach to 'nearby' historical town Segovia.  Apparently an extremely motion-sickness-inducing two and a half hour coach journey around twisty mountain paths is 'nearby'.  Fortunately I was distracted from this horrible journey by a conversation with Lynda Hardman, which I wish I had recorded.  Lynda challenged various aspects of my PhD until I could explain/justify them reasonably, including:

  • Why digital creatives? (I'm used to that one now).
  • What is the outcome?
  • Why Semantic Web for this?

She also recommended a number of resources, including theses of her recent former students to help me with a structure for my own, and advice on maintaining a healthy balance between thinking and doing.

Plus she used to live in Edinburgh, more or less across the road from where I live now.  Cool.  Thanks Lynda!  You haven't heard the last of me :)


Once we got to Segovia, we had a guided tour of the ancient Roman architecture, interesting building façades and local legends.  It was a very good tour, but too hot to really focus.  Then they took us to a restaurant for a local speciality.  I was all set to write a whole individual blog post surveying the barbaric nature of human beings, but I didn't do it straight away and now the passion has faded slightly, so I'll leave it at a paragraph.  Some people watched the local 'ceremony' out of morbid curiosity I imagine, but it was the fact that so many people took so much pleasure in the idea of violently hacking up bodies of three-week-old piglets that really bothered me.  Fortunately the surging standing crowd allowed me (and only one other) to inconspicuously sit it out.  The veggie option was tasty, but it was difficult to really enjoy the rest of the evening whilst wondering vaguely about the states of minds of most of the people I was sharing a table with.

[Notes] Tommaso Di Noia at #SSSW2013

Tools and Techniques

Recommender systems

Input: Set of users + set of items + rating matrix.
Problem - given user, predict rating for an item.

In the real world, the rating matrix is sparse.

Can use hybrid approaches.

Collaborative RS:

  • Like Amazon.
  • Based on other users with similar profiles.
  • Experimentally better than content-based, but you don't always have many users.

Knowledge-based RS:

  • No/little user history.
  • Based on domain knowledge.

User-based collaborative recommendation:

  • Pearson's correlation coefficient - baseline.
  • Imagine millions of users - computing similarities takes a lot of time.
  • So ..
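The Pearson baseline, sketched in Python before moving on (the user names and ratings are made up):

```python
def pearson(ratings_u, ratings_v):
    """Pearson correlation over the items both users have rated."""
    common = set(ratings_u) & set(ratings_v)
    if len(common) < 2:
        return 0.0
    mu_u = sum(ratings_u[i] for i in common) / len(common)
    mu_v = sum(ratings_v[i] for i in common) / len(common)
    num = sum((ratings_u[i] - mu_u) * (ratings_v[i] - mu_v) for i in common)
    den_u = sum((ratings_u[i] - mu_u) ** 2 for i in common) ** 0.5
    den_v = sum((ratings_v[i] - mu_v) ** 2 for i in common) ** 0.5
    if den_u == 0 or den_v == 0:
        return 0.0
    return num / (den_u * den_v)

alice = {"A": 5, "B": 3, "C": 4}
bob   = {"A": 4, "B": 2, "C": 5}
print(pearson(alice, bob))  # ≈ 0.65
```

Doing this for every pair among millions of users is exactly the cost problem that motivates the item-based approach below.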

Item-based collaborative recommendation:

  • Focus on items not users.
  • Compute similarity between each pair of items.
  • Don't have to compute similarity between items that don't have overlapping ratings.
  • Cosine similarity / adjusted cosine similarity (taking into account average rating related to a user to eliminate some bias).
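A minimal sketch of adjusted cosine between two items (the ratings dict is invented):

```python
def adjusted_cosine(ratings, item_i, item_j):
    """ratings: {user: {item: rating}}.  Subtracts each user's mean rating
    (to remove per-user bias) before taking the cosine over co-rating users."""
    num = den_i = den_j = 0.0
    for r in ratings.values():
        if item_i in r and item_j in r:          # only overlapping ratings count
            mu = sum(r.values()) / len(r)        # this user's average rating
            di, dj = r[item_i] - mu, r[item_j] - mu
            num += di * dj
            den_i += di * di
            den_j += dj * dj
    if den_i == 0.0 or den_j == 0.0:
        return 0.0
    return num / ((den_i ** 0.5) * (den_j ** 0.5))

ratings = {"u1": {"A": 5, "B": 5, "C": 1},
           "u2": {"A": 4, "B": 4, "C": 1}}
print(adjusted_cosine(ratings, "A", "B"))  # ≈ 1.0: rated alike by everyone
```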

Content-based RS:

  • Based on description of item 
  • and profile of user interests.


  • Items are described in terms of attributes/features.
  • Finite set of values associated with features.
  • Item representation is a vector.
  • Don't necessarily have complete descriptions of items - just have a 0 in your vector.


  • Similarity between items: 
    • Jaccard similarity.
    • Cosine similarity and TF-IDF (term frequency - inverse document frequency).
    • Batch compute similarities offline, then use similarities to compute ratings on the fly based on user profile.


  • Predict ratings only for the N nearest neighbours of items in the user profile, that are not themselves in the profile.
  • An item is worth rating if more than x of N number of neighbours are within user profile.
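A toy version of the offline TF-IDF + cosine pipeline described above (the film descriptions are made up):

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """docs: {item: list of descriptive tokens} -> {item: {token: weight}}."""
    n = len(docs)
    # document frequency: in how many items does each token appear?
    df = Counter(tok for toks in docs.values() for tok in set(toks))
    vecs = {}
    for item, toks in docs.items():
        tf = Counter(toks)
        vecs[item] = {t: (c / len(toks)) * math.log(n / df[t])
                      for t, c in tf.items()}
    return vecs

def cosine(a, b):
    num = sum(w * b[t] for t, w in a.items() if t in b)
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return num / (na * nb) if na and nb else 0.0

# Hypothetical item descriptions, just to exercise the pipeline:
docs = {"film1": ["comedy", "romance"],
        "film2": ["comedy", "action"],
        "film3": ["drama", "war"]}
vecs = tfidf_vectors(docs)   # computed in batch, offline
```

The `vecs` dict is what you'd precompute offline; ratings would then be estimated on the fly from the similarities of items in the user's profile.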


Using LOD

To mitigate lack of information/descriptions about concepts/entities.

Recommender systems are usually vertical, but LD lets you easily build a multi-domain recommender system.

To avoid noisy data, you have to filter it before feeding your RS.

Freebase.


Tipalo

  • Automatic typing of DBpedia entities.


Vector space model for LOD

  • MATHS.



[Notes] Harith Alani at #SSSW2013

Social Media Analytics with a Pinch of Semantics

Using semantics to solve problems (not solving problems of semantics).

SM for businesses:

  • Analytics.
  • How to measure success?

SM silos impeding progress.
In-house social platforms increasing, so even more so.

SIOC to integrate online community information.
SIOC + FOAF + SKOS.

FB Graph.
People are likeaholics.  Their 'likes' become meaningless, so you need to take this into account when making recommendations.
Browse your data and understand user actions.

Behaviour analysis.

Bottom-up analysis.
Can handle unexpected or emerging behaviours.

  • Community members classified into roles.
  • Identify unknown roles.
  • Cope with role changes over time.
  • Clustering to identify emerging roles.

eg. focussed novice; mixed novice; distributed expert; ...
Spectrum across users you can or can't do without.

Extending an ontology built on SIOC.

Encoding rules in ontologies with SPIN.

Three categories of features:

  • Social features (people you follow, people follow you, ...)
  • Content features (what you're posting, keywords, ...)
  • Topical/semantic features

Which behaviour categories do you need to cater for more than others?  How roles impact activity in an online community.

Consistently see that you need some sort of stable mixture of behaviours for activities in forums to increase.

==> Don't know what's causing which.

What is a healthy community?

Use behaviour analysis to guess what's going to happen to community. Eg.

  • Churn rate.
  • User count.
  • Seeds/non-seeds proportion (how many / whether people reply to you).
  • Clustering.

Unexpected: the fewer focused experts in the community, the more posts received a reply.
(But quality of answers?)

Community types (Little work in this space)

Muller, M. (CHI 2012) community types in IBM Connections:

  • Communities of Practice
  • Teams
  • Technical support
  • ..
  • .. (see slides..)

Need an ontology and inference engine of community types.
Wants an automated process to tell you what type of community it is - it might be something it wasn't set up for.
Then you could determine what sort of patterns you would expect to find.
No one has done this yet.

Measurements of value and satisfaction

Answers different across communities.  They ran it on IBM Connections - corporate community.

Most of this work is for managers of communities - see what's happening and help to predict what might be coming next.

Can we classify users based on Maslow's Hierarchy of Needs?
Mapping the hierarchy to social media communities.
~90% users happily staying at the lower levels of the 'needs hierarchy'.

Behaviour evolution patterns

What paths they follow over time.
eg. people who become moderators eventually.

Engagement analysis

What's the best way to write a tweet so that people care about it?
Which posts are likely to generate more attention?

Getting bored of people finding patterns in individual datasets.  What can be generalised to other communities?

So experimented with 7 datasets and looked at how results differed across:

  • community types.
  • randomness (vs. topicality) of datasets.
  • related experiments.

And people use different features.

Semantic sentiment analysis in social media

Too much research going on, especially on twitter.

Extract semantic concepts from tweets; likely sentiment for a concept.
Tweetnator.
Semantics increases accuracy by 6.5% for negative sentiment; 4.8% for positive sentiment.

OUSocial.
Students don't use in-house networks because they already use facebook groups etc. Want to analyse what's happening on them.

Upcoming

Reel Lives (inc. Ed.)
Fragmented digital selves.
Want to automate compilations of media (photos, messages) posted online.

Changing energy consumption behaviour.
Providing information is not enough.

Social Eco feedback technology.

Tuesday, July 09, 2013

#SSSW2013: Collaborative ontology engineering and team formation

We were introduced to the various mini-projects on Tuesday morning, and encouraged to form teams with people who weren't from the same university.  I quickly shortlisted the five that sounded most interesting to me, but was disappointed that there weren't any about multimedia.  Because how to evaluate a very subjective system is a potential problem for me, the project proposed by Valentina Presutti was my first choice:

"Serendipity can be defined as the combination of relevance and unexpectedness: an information is considered serendipitous if it is at the same time very relevant and unexpected for a given user and in the context of a given task. In other words, a user would learn new relevant knowledge. To evaluate the performance of a tool (e.g., an exploratory search tool, a recommending system) in terms of its ability to provide users with serendipitous knowledge is a hard task because both relevance and unexpectedness are highly subjective. This miniproject focuses on two main research questions: what is the correct way of designing a user-study for evaluating an exploratory search tool performance in terms of serendipity? Is it possible to build a reusable set of resources (a benchmark) for evaluating ability to produce serendipity, allowing easier evaluation experiments and comparison among different tools?"

Nobody else seemed to be interested though, so I resigned myself to not being able to do it... until I explained the project and why it was interesting, to the best of my ability, to Andy, Oscar and Josef, and they were sold enough to mark it as our first choice.  Thus Team Anaconda Disappointed (a name of significant and mysterious origins) was born, and Project Cusack (because of the movie Serendipity, which nobody got) was underway.

Our first lecture today was from Lynda Hardman, about telling stories with multimedia objects.  It was super relevant to what I'm doing, to the point where I'm surprised I hadn't come across her work already.  My notes are here.  Lynda has done, for example, work with annotation of personal media objects like holiday photos in order to combine them into a media presentation.  She has considered similar things to me, in particular noting that there are many many aspects of data about multimedia - I had assembled my take on this into a Venn diagram for my poster..



One I hadn't considered is annotating an explicit message of a piece of media, intended by the creator.  This isn't always relevant - sometimes the consumer's interpretation of the media is more important - and this in itself might be an interesting annotation problem.  Competing perspectives - something an ontology should be able to represent.

I need to check out COMM - Core Ontology for Multimedia.

She has an overview of the canonical processes into which they have consolidated the production of digital content, and how annotation can be formed around these.

Lynda also told us about Vox Populi and LinkedTV; practical applications of annotating multimedia.

I made lots and lots of notes.

Natasha Noy gave us some insights from the biomedical world with regards to ontology development, particularly in relation to the International Classification of Diseases which, when last revised in the 80s, consisted of a lot of paper and a whoever-shouts-the-loudest algorithm for inclusion of terms.  But the next version, currently under creation, is being developed with a version of Web Protégé, customised to be friendly for those who don't know or care about ontologies, and is a truly collaborative process (for those allowed to take part) with accountability for all changes.  It's open too though, so even those without modification rights can view and comment on the developments.  My notes are here.

Lunch was for the first time outside, under the shadows of the forest, and for me was a tray of tomatoey vegetables that were delicious but few.  A striking contrast to Monday's lunch.  Everyone else had some meat-potato combination, preceded by a salad with tuna, and followed by a peach.

The hands-on session followed on from Natasha's talk.  We teamed up (temporarily Anaconda Hopeful) and played with Web Protégé.  There were two magazines and two newspapers, each with four departments.  Anaconda Hopeful were randomly designated the Advertising Department of Iberia Travel (a food and travel magazine).  We got stuck in, on paper first to identify some classes and relations that were relevant to us, and then with Web Protégé, along with the other departments of Iberia Travel.  We didn't run into any conflicts, but ended up creating a few classes that we needed even though they should really have been the remit of another department (I guess we just got there first).

Then it was announced that Iberia Travel had bought the other magazine (and one of the newspapers had bought the other), and we had to work together to merge ontologies with the other department.  It became apparent that the other magazine had never had an Advertising Department (no wonder they went under!) so we had no-one to attempt to merge ontologies with.  We attempted to sell our expertise to the Advertising Departments of the newspapers, but there were already too many people involved in the heated debate that came out of the ontology merging there, so we couldn't really get involved.

Later we got cracking with our mini-projects.  Valentina showed us aemoo, and the experiments her team had come up with to try to evaluate it.  We sat down by ourselves to brainstorm, describing a lot of concepts for ourselves, breaking down the notion of serendipity, figuring out what might be wrong with existing experiments to 'measure' serendipity, and collating literature in the area.  (Turns out there is a lot, and it's a very interdisciplinary issue; lots to read about from social sciences, anthropology etc, as well as philosophy of science.  In computing, it seems to be primarily discussed within the realms of recommender systems and exploratory search).

Serendipity seems to be mainly described as a combination of unexpectedness and relevance.  Problems include the sheer subjectivity of it.  Some people are going to get excited by all facts they find out, whether they're useful or not.  Some people are going to have hidden, inexplicit or subconscious goals that affect how 'relevant' something is to them.  People describe their different areas of expertise in different ways; some are more humble than others and would not call themselves an expert in a topic, for example.  So whether or not an event can be considered a serendipitous one is a complex question, which must take into account the person's background, goals and existing knowledge, the task they are trying to achieve (or lack thereof, as serendipity is particularly important - in my opinion - in undirected, loosely-motivated activities), the way they are able or encouraged to interact with a system, what they are doing before and after... all these things make up a context for someone's activities, and none of them seem to be particularly measurable.

Dinner was a vegetable and potato (yay!) starter, followed by spaghetti in tomato sauce (fish for everyone else, although Andy got a custom omelette, lah-de-dah).  Also an apple.  We learnt the hard way not to sit at a table directly underneath a light, as the bugs just raiiiin down.

After dinner we crowded around Enrico who had offered to provide advice about PhD-ing.  From this session, I have a signed diagram of the life of a PhD, because he borrowed my notebook to make it.  I tuned in and out of the discussion, and noticed some irregularities between my PhD and what seemed to be 'normal'.  For instance, most people didn't seem to have as much control over their topic, or what they were doing at any given moment in their first year.  I am really, really enjoying my freedom, but in order to justify that I deserve it I need to sort out my lack of direction and focus.  I need to believe in what I'm doing - not be told by someone else - which is one of the main reasons I am doing this particular PhD.  Perhaps I need to ask for more guidance to more quickly reach the necessary conclusions for myself.  (And, of course, perhaps I also need to stop taking big chunks of time out periodically for different reasons; that might speed up the process as well).

Later, the overriding sentiment was that the job of a PhD student was to answer a question, to produce a theory.  Not to create a system or solve a large problem; certainly not to worry about practical, real-world applications of theories.  Well, I've already explained that this is something I can't accept, and I still am not convinced that it is going to impact on my ability to do a PhD.  Theories develop during practice.  Coding and designing, like writing, are part of my thought processes, and I reach realisations or find new questions to ask through hacking and playing and making.  And why would I be hacking and playing and making, if not to try to produce something of real-world value?  If my motivation in making a system is explicitly to come up with new theories, then my approach and outcomes and realisations will be entirely different.  In trying to make something that works for real people, rather than for researchers in the clean and sterile laboratory of a restricted domain or specific context, I figure out different things, things that matter.

There was another discussion that I came into a bit late, but it sounded like a very harsh take on the problems with research in industry (rather than academia), one that seemed very overstated compared to what I have read and experienced myself.

By the end of the day, it felt like I'd been at Summer School for weeks, and had known everyone forever.

[Notes] Natasha Noy at #SSSW2013



Stanford, Protégé.

In past 10-15 years, through collaboration with scientists (particularly biomed), ontologies have become essential.

Don't need to sell ontologies to scientists, they believe in it.

Focus on science because that's where she has experience etc.

We're not so bad at versioning ontologies; versioning data is more the problem.

Experts add stuff, curator checks quality, and publishes upcoming tasks.

Similar to open source developments, but no research to compare the two.
- Different because biomed people are paid (well).

ICD - International Classification of Diseases.
  • Started 17th century.
  • Causes of death, medical bills, policy making.
  • Revised in 80s over 8 annual conferences.
    • 17-58 countries, 1-5 person delegations, mainly health statisticians.
    • Manual, on paper.
    • Whoever shouted loudest..
    • Paper copies, only English, pdf.
  • ICD-11 - OWL ontology!
    • Open, Protégé (a customised, Web version), links to others.

Conflict resolution:
  • People naturally don't step on each others' toes.
  • Users expect stuff like Web 2.0 interactions, Web interface.
Web Protégé:
  • No consistency checking - coming but currently must go offline.
  • Ontologies are solution to everything - versioning, roles, social interactions.
  • Also plugins are the solution to everything - visualisations.

[Notes] Lynda Hardman at #SSSW2013

RELEVANT.

Users (consumers?):

  • Finding content
  • Media types - mostly text at the moment, little integration of different types.
  • Specific tasks - not much connection of results with user tasks.

More data than just what you see in the media (cue my Venn diagram).

Plus, eg. paintings - lots of 'cultural baggage'.

Care more about the story than the media.
Interpretation by end users.  Hopefully message that the author intended.

Meaning of combination of assets.
eg. Exhibition of an artist's work.

Interacting further with the media.

  • Search - serendipitous or focussed around a theme (or both).  Different search goals.
  • Sharing, passing it on.

(SW and multimedia community need to work together).

-> Raphael Troncy on Friday - attaching semantics to multimedia on the Web.

Need mechanisms:

  • to identify (parts of) media assets.
  • associate metadata with a fragment.
  • agree on meaning of metadata.
  • enable meaningful structures to be composed, identified and annotated.
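The first two mechanisms already have a W3C answer: Media Fragments URIs, which address a temporal clip (#t=10,20) or a spatial region (#xywh=...) of an asset, so an annotation can target the fragment URI directly.  A minimal sketch of parsing the two common dimensions (the function name and the handling of edge cases are my own; this is not a complete implementation of the spec):

```python
from urllib.parse import urlparse, parse_qsl

def parse_media_fragment(uri):
    """Parse the W3C Media Fragments parts of a URI:
    temporal (t=start,end) and spatial (xywh=x,y,w,h, pixel variant).
    Returns a dict mapping dimension name to a parsed value."""
    fragment = urlparse(uri).fragment
    result = {}
    for key, value in parse_qsl(fragment):
        if key == "t":  # temporal: either endpoint may be omitted
            start, _, end = value.partition(",")
            result["t"] = (float(start) if start else 0.0,
                           float(end) if end else None)
        elif key == "xywh":  # spatial region in pixels
            result["xywh"] = tuple(int(v) for v in value.split(","))
    return result

clip = parse_media_fragment("http://example.org/video.mp4#t=10,20")
region = parse_media_fragment("http://example.org/photo.jpg#xywh=160,120,320,240")
```

Metadata can then be attached to the fragment URI as a whole, which covers the "associate metadata with a fragment" point without inventing new identifiers.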

Workflow for multimedia applications

  • Canonical processes of media production
    • Reduced to the simplest form possible without loss of generality.

Heard of MPEG-7? Don't bother.. very much from a media algorithms perspective.

Applications:

  • Feature extraction.
  • News production.
  • New media art.
    • An interactive exhibit that responded to the audience present.
  • Hyper-video.
    • Linked video.
  • Photo book production (CeWe).
    • (Using this example for explaining processes).
  • Ambient multimedia systems with complex sensory networks.

Canonical processes overview...

There's a paper.

CeWe photobook - automatic selection, sorting and ordering of photos.
Context (timestamp, tags) analysis and content (colours, edges) analysis.
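As an illustration of the context-analysis step, photos could be grouped into 'events' by timestamp gaps before sorting and ordering.  This is my own sketch, not CeWe's actual algorithm, and the one-hour threshold is arbitrary:

```python
def group_by_time(photos, gap_seconds=3600):
    """Split (name, unix_timestamp) pairs into 'events': a new group
    starts whenever the gap to the previous photo exceeds gap_seconds."""
    ordered = sorted(photos, key=lambda p: p[1])
    groups, current, last_ts = [], [], None
    for name, ts in ordered:
        if last_ts is not None and ts - last_ts > gap_seconds:
            groups.append(current)
            current = []
        current.append(name)
        last_ts = ts
    if current:
        groups.append(current)
    return groups

photos = [("beach1.jpg", 1000), ("beach2.jpg", 1300),
          ("dinner1.jpg", 9000), ("dinner2.jpg", 9200)]
events = group_by_time(photos)  # two events: beach photos, dinner photos
```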

Things from these you want to represent in your digital system (ie. with LOD):

  • Premediate, eg.
    • remember to take your camera on holiday.
    • write scripts, plan shots.
    • place a security camera in the right location.
  • Construct Message (not really in the chain, appears all over the place); what to convey with media? Intention? eg.
    • show people a great holiday.
    • sell a product.
    • inform/advise.
  • Create (method of creation might be important, so record in metadata), eg.
    • take photos.
    • make video.
  • Annotate, eg.
    • automatic or manual.  Stuff that is embedded by device vendors (but there's so much more...)
    • domain annotations: landscapes/portraits, timestamps, face recognition.
  • Publish, eg.
    • compose images into photobook.
  • Distribute, eg.
    • print photo book and post.
    • cyclic processes online.


COMM - Core Ontology for Multimedia.

Premediate and construct message - human parts, she doesn't expect them to be digitised any time soon.

Using Semantics to create stories with media

Can we link media assets to existing linked data and use this to improve presentation?

How can annotations help?

  • What can be expressed explicitly?
    • Message (somewhere between an HTML page and poetry).
    • Objects depicted.
    • Domain information.
    • Human communication roles (discourse).

Vox Populi (PhD project)

Traditionally video documentary is a set of shots decided by director/editor.
vs.
Annotating video material and showing what the user asks to see.

interviewwithamerica.com

Annotations for these documentary clips:

  • Rhetorical statement; argumentation model (documentary techniques).
  • Descriptive (which questions asked, interviewee, filmic).
    • Filmic: continuity like camera movements, framing, direction of speaker, lighting, sound - rules that film directors know.
  • Statement encoding (eg. summary what the interviewee said):
    • subject - modifier - object statements.
    • Thesauri for terms.
    • Can make a statement graph, finding which statements contradict and which agree.
    • (He encoded this stuff by hand - automated techniques aren't good enough).
    • Argumentation model - claims, concessions, contradictions, support.
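The statement encoding above could be sketched as a tiny statement graph.  The opposition table and function names here are my own illustration; in Vox Populi the terms come from thesauri and the encoding was done by hand:

```python
# A statement is a (subject, modifier, object) triple, as in the
# encoding of what interviewees said.  Which modifiers oppose each
# other would come from the thesauri; this table is hand-made.
OPPOSITES = {("is", "is not"), ("is not", "is")}

def contradicts(s1, s2):
    """Two statements contradict if they share subject and object
    but use opposing modifiers."""
    return (s1[0] == s2[0] and s1[2] == s2[2]
            and (s1[1], s2[1]) in OPPOSITES)

def statement_graph(statements):
    """Link every contradicting pair of statements."""
    return [(a, b, "contradicts")
            for i, a in enumerate(statements)
            for b in statements[i + 1:]
            if contradicts(a, b)]

stmts = [("war", "is", "justified"),
         ("war", "is not", "justified")]
edges = statement_graph(stmts)  # one 'contradicts' edge between the pair
```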


Automatically generated coherent story.

  • Are we more forgiving watching video than reading these statements as text?  People's own interpretations strongly affect understanding of the message.


Vox Populi has a GUI (not for human consumption) for querying annotated video content.

User can determine subject and bias of presentation.
Documentary maker can just add in new videos and new annotations to easily generate new sequence options.


User information needs - Ana Carina Palumbo

Linked TV.  Enhancing experience of watching TV.  What users need to make decisions / inform opinions.

  • Expert interviews (governance, broadcast).
  • User interviews - what people thought they need (215 ppts).
  • User experiments - what people actually need.

Experiment - oil worth the risk?

  • eg. people wanted factual information from independent sources; what the benefits are; community scale information.


Published at EuroITV.

Conclusions

  • We can give useful annotations to media access, useful at different stages of interactive access (not just search).
  • Clarify intended message.  Explicitly, with annotations.
  • Manual or automatic.
  • Media content and annotations can be passed among systems.
  • No community agreement on how to do this.
  • How to store?

Questions

Hand annotations are error prone - how to validate?
Media stuff - there can be uncertainty, people don't always care.

Motivating researchers to annotate...
Make a game.

Store whole video or segments?
W3C fragment identification standards - timestamps via URLs.

Monday, July 08, 2013

#SSSW2013: Research in theory and practice, and where on earth am I?

The 10th Summer School for Ontology Engineering and the Semantic Web

Sunday

Arriving by train into Cercedilla, north of Madrid, we immediately encountered other confused looking folk with poster tubes.  So we shared taxis (EUR 10) from Cercedilla station to the summer school residence further north, in the forest.

After getting keys for our pleasant, single, en-suite rooms, arrivals congregated in the shade by the building to introduce ourselves... again, and again, and again, as new people continuously arrived over the space of a few hours.

A really broad mix of people are here in terms of nationalities and places and levels of study, but I still haven't quite got used to the fact that answering 'Semantic Web stuff' is not specific enough in this crowd, when someone asks you what your research is about.  Nobody needs convincing that these technologies are useful!

Later we received schedules, maps, ill-fitting t-shirts* and very helpful name badges, and headed for dinner at the bar down the road.

As is traditional when I write about my experiences in new places, I will describe the food every day.  It has become apparent, at this residence at least, that variety of ingredients is not ordinary, so in this respect meals are simple.  Dinner that first night started with a salad (lettuce, olives, tomato, onion, shredded beetroot and a single slice of hard boiled egg; no dressing), followed by - for the majority - slices of meat (beef? Pork? I dunno..) and fries.  Mine was a plate of mushy green vegetables with a little seasoning, that was pretty tasty.  Dessert was a single pear, delivered with ceremony, but otherwise unadorned.  Healthy, at least.

Yet we were all (those I sat with, at least) left feeling a little unsatisfied.

I shared a table with a French, Spanish, Italian and Irish guy.  Conforming appropriately to stereotypes, and setting up reputations for the rest of the week, the French and the Italian shared the bottle of wine on the table; the rest of us went without.

I returned to bed after a couple of hours of socialising and enjoying the cool air in and around the bar.

* For next year, they could ask for t-shirt sizes when they ask for dietary preferences?

Monday

The day started early, and with no hot water or wifi for anyone.  Breakfast was combinations of sweet pastries, coffee, tea, juice and bread.

Punctuated variously by coffee breaks, the learning began in earnest.

During the introduction by Mathieu D'Aquin, I found out that I am one of 53 students selected out of 96 applicants to attend this year's Summer School of the Semantic Web!  I had no idea it was that selective, or that there had been that much competition.

The first keynote was by Frank van Harmelen, about all the Semantic Web questions we couldn't ask ten years ago.

Slides:



Frank started by saying that the early Semantic Web vision has morphed into the more manageable vision of a Web of Data, or a Giant Global Graph, and outlined the principles of the Semantic Web as they appear to stand at present:

1. Give everything a name (entities).
2. Relations form graph between things.
3. Names are addresses on the Web (so we inherit properties of the Web, like AAA - Anyone can say Anything about Anything).
4. Add semantics.
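The four principles in miniature: everything gets a name, relations form a graph of triples, and a little shared semantics (an rdfs:subClassOf-style hierarchy) makes inference predictable.  The data and helper below are my own toy illustration, not a real RDFS reasoner:

```python
# Toy graph: names abbreviated with an 'ex:' prefix standing in for URIs.
triples = {
    ("ex:Tim", "rdf:type", "ex:Researcher"),
    ("ex:Researcher", "rdfs:subClassOf", "ex:Person"),
}

def infer_types(triples):
    """Apply one RDFS-style rule to a fixpoint:
    if X rdf:type C and C rdfs:subClassOf D, then X rdf:type D."""
    inferred = set(triples)
    while True:
        new = {(x, "rdf:type", d)
               for (x, p, c) in inferred if p == "rdf:type"
               for (c2, p2, d) in inferred
               if p2 == "rdfs:subClassOf" and c2 == c}
        if new <= inferred:
            return inferred
        inferred |= new

graph = infer_types(triples)
# ("ex:Tim", "rdf:type", "ex:Person") is now entailed
```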

Frank pointed out the advantages of the fact that the Linked Data cloud, grown naturally rather than designed, is now so big we don't know how many triples it contains, nor how fast it is growing.  Companies and organisations (like Google, NXP, BBC, DataGov) are using Semantic Web technologies to achieve their own ends, for a variety of different use cases, without caring much about the Semantic Web, and this is contributing to the growth.

This growth has given rise to a number of research areas that were impossible to realistically ask questions about ten years ago, including self-organisation, distribution of data, provenance, dynamics and change, errors and noise (how to deal with disagreements).

Frank asserted that rules and structures, algorithms and patterns in data, exist whether we are looking at them or not.  He used the analogy that OWL is our microscope, and it may be the tool that distorts our vision of the information universe rather than properties of what we are looking at (for example, structures in data presenting themselves well in some domains but not others).

He went on to promote the role of the Informatician as being to test theories, to hypothesise and falsify, as scientists rather than engineers.  To discover, rather than build.

I struggle with this view of the world, and feel instinctively that theory and practice are intrinsically linked; one can't exist without the other, not just in the grand scheme of things, but in day to day work and research.  This is one of the main points of contention with my own PhD, and I've no doubt there will be many more blog posts about this issue in the near future as I reconcile my need to create something immediately useful with the necessity of producing a contribution to knowledge at large.

See my raw notes here.

We had an Introduction to Linked Data by Mathieu D'Aquin (raw notes here), followed by a workshop.  We wrote SPARQL queries to populate a pre-written web page with information about Open University courses, sub-courses and locations thereof.

Lunch, similar to the previous night's dinner, was a starter salad, then an entire half chicken (or something) plus fries for the carnivores, and for me the most unappealing risotto of my life (not that I'm ungrateful, but I have never before been unable to finish a meal due to boredom).  I went for a walk with some others to grab some fresh air before the afternoon's work, and missed out on watermelon.

Manfred Hauswirth presented some really exciting stuff about annotating and using streams of data.  Particularly challenging is how to integrate this with static data and make inferences over the lot.  Streams include sensor data, as well as ever-flowing social media streams for example; anything that changes over time.

They've built some systems to process this kind of data, and one of them is available as middleware.

My raw notes are here.

In the afternoon we had a poster session, where all participants pinned up posters about their work, and discussed at length with anyone who was interested.  Here's evidence that I participated.


And here's Paolo's:



I wrote a few notes about things from other peoples' posters that I need to look up.

The main feedback I received was about making sure I focus, narrow down my topic, and concentrate on some evaluatable deliverables that are PhD-worthy.

Questions like (paraphrasing) "why should we care about digital creatives?" threw me, because I thought the obvious answer - that they are people too, Web users, technology users, contributors to culture and an ecosystem of digital content and data - was apparently not enough from an academic standpoint.

I was simultaneously told to focus more, and to explain why the problem I'm trying to solve is applicable to all domains, not just digital creatives.  But some of the problems I'm looking at have been (or are being) solved in other domains (like e-health, biological research, education) and the reason what I'm doing is interesting is because none of these solutions quite work for digital creatives, and I want to find solutions that do, and try to figure out why.

I'm still stuck in some sort of struggle between theory and practice; thinking and doing.  And the long-standing problem of how to decide which doing actually worked.

I've started scribbling notes about the narrowing down problem.  I'll need to have this figured out before my first year review in August anyway, so stay tuned for another post all about it.

Then I sneaked off for a nap.

Dinner at the bar again; the usual salad, plus some eggy fish thing for most.  I got a plate of artichoke.  Artichoke is great, I love it, and I'm all for simple meals.  But I remain unconvinced that a plate of only artichoke constitutes an acceptable level of effort on the part of caterers.  And the sheer quantity made it start to taste a bit funny after a while.  But not to worry; we rounded off with a solitary peach apiece.

Further socialising, and appreciation of the night sky, before returning to bed to write blog posts.

I'm super excited and inspired by the talks, work I've heard about so far, and the atmosphere of the place.  I'm excited to learn a helluva lot, and remind myself that I'm not facing impossible problems, and am not facing many problems alone.  I remember that I am instinctively passionate about the Web and the possibilities it holds (and indeed has already realised) for the empowerment of individuals.  I remember how lucky I am to be able to sustain myself through studying something I love so much, and to have the potential to make a change, and through my work maybe even facilitate others to be able to make a living doing what they love, as well.

[Notes] Poster session at #SSSW2013

Things to investigate further!

LibRDF - linkeddata-perl for Debian by Kjetil Kjernsmo.

Rakebul Hasan, Fabien Geandon (? not sure about names, can't read my handwriting..) - Trustworthiness of inferences.

Taldea - fostering spontaneous communities.
Ghada Ben Nejma.

NERD ontology for spotting entities.
nerd.eurecom.fr
See photo:


[Notes] Manfred Hauswirth at #SSSW2013




Streams: any time-dependent data / changes over time.

Has done a paper about P2P stuff.

Data silo - "natural enemy of SW scientists"

Massive exponential growth of global data.

Still have to integrate dynamic data with static data.
Multiway joins are the dominant operator.  Need to be efficient.

Everything/body is a sensor.

Various research challenges:

  • Query framework.
  • Efficient evaluation algorithm.
  • Optimise queries.
  • Organisation of data.

CoAP ~= http for sensors.

Stuff about sensor networks and context - useful for Michael.

  • Common abstraction levels for understanding.
  • SSN-XG ontology
    • Application: SPITFIRE
  • You can buy a sensor off the shelf that runs a binary RDF store and can be queried.  So possible to use SW tech with resource constrained devices.
  • RESTful sensor interfaces stuff being standardised - CoRE, CoAP.
  • Linked Stream Model
  • CQELS-QL (extension to SPARQL 1.1; already legacy)

Rewrite query to spit out static and dynamic - lots of overhead.
But need to optimise between these.
Neither existing stream processing systems nor existing databases could be efficient enough.
So they built their own LD stream processing system.  (Optimised and adopted existing database stuff).

HyperWave - didn't succeed.  Didn't listen to customers and wasn't open source (license fees).
But better than hypertext was back in the day.
Performance important for success/uptake.

Just putting it on cloud infrastructure doesn't mean it scales.

  • Need to parallelize algorithm.
  • Took it to a point where adding more hardware did help.
  • Problems!  Inconsistent results, engines don't support all query patterns.. very early, don't fully understand yet.
  • Long way to go.  How to prove what is a correct result?
  • Needs to be easy to use - dumb it down.
    • Linked Stream Middleware (available):
      • Flights, _live trains_ - SPARQL endpoint!, traffic cams.
      • SuperStreamCollider.org
      • Current Tomcat problem with twitter streams.

To do?

  • Scalability
  • Stream reasoning (only processing, pattern matching, so far.  Want to infer conclusions).

World is:
... uncertain, fuzzy, contradictory.
So combine statistics and logics.
Hard to scale logical reasoning, so use statistics to shoot in the right direction.

Privacy?

  • Build systems! Can't do thought experiments about the Web.

Don't get hung up on approaches / labels.

[Notes] Introduction to Linked Data at #SSSW2013

(by Mathieu D'Aquin).



Linked Data = universal connections, like Lego.

Universal is why it's important.

Workshop instructions.

The only problem we had during the workshop was disagreement about how to read the 'broader' and 'narrower' relations between courses.  They instinctively read contrary to what (my) common sense suggested (eg. that 'arts and humanities' is broader than 'history', which some people disagreed with).  A quick reference to the ontology documentation resolved that.

[Notes] Frank van Harmelen at #SSSW2013


Semantic Web & Web of data = a more manageable mission.
Metaweb movie - got bought by Google and incorporated into Knowledge Graph.

SW Principles:
1. Give everything a name (entities).
2. Relations form graph between things.
3. Names are addresses on the Web (so we inherit properties of Web like AAA).

This becomes Giant Global Graph.  (Maybe SW should be called Giant Global Graph?)

4. Add semantics.

  • Types of things, relationships.
  • Hierarchy, constraints:
    • Inferences.  Bounding shared beliefs by sharing ontological information.  Space for confusion gets smaller and we begin to agree on interpretation of information.
Semantics = predictable inference.

Google: from just links to results, to information boxes (last May).  Can't directly address Google Knowledge Graph.
NXP (microprocessors): 26,000 products. Integrated all databases into triplestore.  Exposing subset of triplestore to customers.
BBC: 125 million triples.  Many data sources.  APIs to website.  Own ontologies.

All have the same triple-layer architecture:

Raw data
   |
SW layer
   |
Output / API / UI etc

DataGov: eg. air quality in cities, campaign money, if policies work.

Companies don't care about SW, but are using these technologies for their own IRL purposes.

These are all different types of use cases of SW technologies:

  • search;
  • data integration;
  • content re-use;
  • SEO;
  • data publishing.

It's important that the SW graph is so big.

  • More questions to ask.
  • Good that we no longer know how big, or how fast it is growing... Tens of billions of facts.
    • How many are really permanent?
    • Some are stable, some will disappear - just like the 'regular' Web.
      • "...it being a mess is the only reason why it scales."
We need to get used to the idea of SW being a mess - aka "a system so large you can no longer enforce central control" (complex system).

The LD cloud is still poorly interconnected, but good graph properties.

SameAs.org

Heterogeneity is unavoidable.
Socio-economic, first to market - why certain systems/ontologies get used, eg. schema.org, dbpedia.

Self-organisation.
LD cloud grew, nobody designed it.
Knowledge follows power curve.  This has an impact on mapping and reasoning, storage and indexing.

Distribution.
Web not geared for distributed SPARQL queries.  Everyone pulls in all data and queries local copy.  Not very 'webby', disadvantageous.  So subgraphs?  Query planning?  Caching?  Payload priority?

Provenance.
Representation, (re)construction.  Metametadata (knowledge about knowledge; uncertainty; problems with vocabs for this).
How to get from provenance to trust.

Dynamics (change).
Cool Web in 60 seconds graphic.
SW not changing this fast, but soon..

Errors and noise.
Sometimes we disagree.
Deal with by: avoid, repair or contain.  Or just deal with it - allow argumentation.
Fuzzy, rough semantics - almost, maybe.

Lots of research questions.  But not ones we could ask 10 years ago.

Information universe - "algorithms exist without us looking at them".

We should ask if things work in theory.
Scientists vs. engineers.
Discovering vs. building.
  • Is this incidental or universal?
OWL is our microscope.
We can see structure well in some domains, but not so well in others.  Maybe it's our tool that distorts, rather than a property of the domain.

Says we should change our mindset from building stuff to hypothesising and falsifying.

Sunday, July 07, 2013

Madrid: First impressions

I've never been to Spain, and for some reason the Spanish language baffles me.  It's the least guessable European language, in my humble opinion, and I can't get my head (/tongue) around the pronunciation.  I've never studied Spanish, but I can get by just fine with French and German, and when I've been in Switzerland, Holland and even Italy I could make a go of communicating to a reasonable degree, so I thought I'd pick things up.

But apparently my brain is resistant to Spanish.

That aside, locals are very friendly and even when they don't speak English they don't seem to give you disparaging looks.

We arrived on Saturday evening (having successfully got our poster tube past various levels of EasyJet staff who would have been within their rights, if unreasonable, to tell us it was too big for hand luggage).  We were met at the airport by new friends, who we later accompanied into the centre of the city to watch the Gay Pride Parade.

The streets were packed, the heat was stifling, and the costumes varied and outrageous.  The party atmosphere filled the air with a tangible excitement.  The parade itself was slow to start, but eventually lasted for several hours.  We explored a little, taking in this version of the city as the empty streets of the Sunday morning to follow would have a very different feel.

But mostly we sat on the grass, chilling with our hosts and their friends, who were mixing cheap wine and lemonade and bobbing to the waves of techno, trance or cheesy pop that came by with every float.

The night was hot, but not unbearable; about 37 degrees, yet there was a slight breeze and no humidity which made all the difference.

On Sunday we braved the sun to do the tourist circuit recommended to us to take in as much of the city as possible in the time we had.

We deviated somewhat to explore various gardens (beautiful, although I was too distracted by the heat to really appreciate them) and pop into some impressive looking cathedrals (a nice break from the heat, but I find Christian art, architecture and interior decoration disturbingly morbid).

Never having thought about visiting Madrid before, and thus never having planned out what I might like to see when I got here, none of the monuments, buildings or squares stood out to me.  Architecture is pale and old-looking, and mostly very ornate.  Similarly pale sculptures, statues and fountains are in abundance.

Distinguishing vegetarian options in restaurants and cafes seemed more challenging than I had the energy for; not to mention, most didn't start serving food until at least one-thirty which wouldn't have allowed us time to catch our train, so I'm ashamed to say we grabbed snacks from a supermarket instead.

We (eventually) caught a train from Atocha station in the centre to Cercedilla (EUR 5.30).  And thus began the Summer School adventure... Continued in another post.

Week in review: Catching up, poster


Returned to Edinburgh on Wednesday.  Suffering from post-travel depression.  That's a thing, I looked it up.  It's a thing nobody has any sympathy for.  Life is hard.

Started and finished my poster for the Semantic Web Summer School.

Went to Spain for aforementioned summer school.  Spain is hot.  Let this mental week of learning begin!

Thursday, July 04, 2013

Post-Oz: Some conclusions

So I got behind with the blogging.  As in, I didn't do any.  I will rectify that over upcoming days!  But for now, I will reflect on my Pre-Oz quests.

1. Find the resting place of my Uncle David.

Achieved, debatably.  I found the Herb Garden, South Yarra, which was part of Melbourne's amazing Botanic Gardens (and conveniently pretty much across the road from where Jamie lived).  However, it has never been legal to scatter ashes there, and there definitely aren't any headstones.  It doesn't mean he wasn't scattered there though.  So I took lots of pictures.  That's the best I could do.  It is beautiful.

2. Forge ties between Edinburgh and Melbourne Open Knowledge Foundations.

Well, I went to #govHack and had an amazing time.  Pete and I discovered we'd already been to one of Flanders' hacks in the UK, and another hack that had derived from it (Dev8D and DevXS).  So we were well received.  Also because we'd travelled 11,000 miles to be there.  And I got to know some things about Australian data.  Stay posted for the official blog.

Check out some of the places my Mum and her family lived.

I found a caravan park in Palm Cove where she lived 38 years ago, and the creek where she learned to swim!  The change in the town from how she described it was immense, but walking the same beach she walked as a child was a really surreal experience.  I also dangled my feet in Mossman Gorge in the Daintree rainforest before I even knew that she used to swim there.

I saw a sign to Toowoomba near Brisbane, where she lived too.

Maybe attempt to bump into relatives or family friends.

Wasn't expecting to achieve this.  But I saw my cousin in Perth!  That was awesome.  And it was very good of him to pick us up from the airport, give us a super fast tour of Perth at night and let us sleep on his floor for a few hours, before returning us to the airport for 5am.  Thanks Luke!

It almost made up for the ..ahem.. additional expenses incurred.

See Ayers Rock.

Check!  It wasn't all that.

But the road tripping was amazing.

See the main cities on the east coast, like Brisbane, Sydney and Canberra.

Check!  But only just.  Brisbane and Canberra need more visits.  I wrung quite a lot out of Sydney, made new friends (one of whom was feathered), ate some really tasty food, and learned to love the Domestic Airport.

Two freelance projects, due on 5th June and 1st July.

Succeeded at the first (with some hiccups) and failed at the second (pending...).

Finish my PhD literature review...

Haha.  That was ambitious.  It didn't get a look-in.  I feel bad.  There was no wifi though!  Anywhere!

Make a poster for the Semantic Web Summer School.

I didn't do that whilst travelling, but it is done now.


Meanwhile, whilst you wait for my full blog posts, here are our tweets, nicely collated:


Monday, July 01, 2013

Experimenting with airport security

I have been on no fewer than thirteen planes in the last four weeks or so.  I've already made my apologies to the environment.  If it's any consolation, it makes my ears really hurt every time.  It's only four days, after that, until I get on a flight from Edinburgh to Madrid.  It's becoming as normal as going for a walk.  (A walk with earache and slight deafness).

Presumably, if using electronic equipment during takeoff and landing, and mobile phones at any time, was actually dangerous there would be some sort of machine to detect if any passengers did have things turned on, and electric shock them (or something).  And there would be a degree of consistency in how these things were handled between airports.  In quite a few airports in Australia, we were asked to turn our phones off altogether before stepping onto the tarmac to approach the plane.  I saw a couple of people who were walking and tapping their screens get pulled aside and refused boarding until their devices were safely off and pocketed.  No planes were crashing in the background.  And nobody checked that out-of-sight gadgets were switched off.  So you only pose a risk if you get caught?  And in Zurich, the crew were happy with people making phonecalls right up until the plane started trundling.

Security is inconsistent too.  And a massive hassle, so in an endeavour to minimise unpacking and re-packing time at the gates, I gradually reduced the number of electronics and liquids I removed from my hand luggage.

A timid and inexperienced flyer before this year, I used to think it was standard procedure to remove everything with a glimmer of metal or a drop of liquid for an x-ray in its own separate tray.  So out came the chargers, headphones, coins, cards and soaps, along with tablets, e-readers, laptop, phone and clear bags of 100ml containers of liquids.

Before the start of this trip I'd figured out that was excessive, and silently pitied those who unloaded all of these things still.

Time to see what else the security guys don't give two shits about.

Kindle.  That was fine.  Nobody questioned a Kindle left in my bag.

Tablet was the next to stay.  My Nexus 7 caused zero concern.

At this point I'm just unloading a laptop and liquids.

So I left the toothpaste in.  That was fine.  Then I found out one of my travel buddies had been inadvertently passing through security gates with a 250ml bottle of suncream in the side pocket of his rucksack.  Apparently not an issue.

At some point on the trip I acquired small scissors; so weak they could barely cut thread.  They made it through security in Sydney the first time just fine, but in Alice Springs my bag got pulled back.  I thought it was the toothpaste, but no.  Once they had confirmed that these scissors were such that they'd buckle before piercing skin, they let me have them back.  Same again in Cairns, and Gold Coast.  By Sydney round three, the scissors had broken into two parts.  A very concerned looking lady asked if I minded if she threw them away rather than let them on the plane.  I didn't.

By this point I was sending 50ml nasal spray and deodorant through, because what the heck?  They were also cool with our massive jar of pasta sauce, and whatever we happened to be drinking at the time.

So within Australia, I was just removing my laptop at security (and phone from my pocket).

In Singapore, signs actually advised that we keep phones, tablets and e-readers packed, so only the laptop came out there too (my liquids were in my hold baggage by this point).

Zurich, however, are nuts.

No liquids still, and I got out my Nexus and Kindle straight away because someone ahead of me was getting yelled at about an iPad.

Apparently that wasn't enough, and eager staff sent my bag back through three times, unloading more on each round.  Out came chargers, headphones, bags within bags, chocolate.  I was almost ashamed of how much crap I was carting around.  For the final round they dug deep to extract a sealed bar of soap I bought someone as a souvenir from Cairns.  That seemed to do the trick.

So then I had to hold up the line re-packing.

How does x-raying soap on its own make it less of a threat?  Or easier to detect the threat?  Serious question.  Anyone know?

I'm just pleased I left my Vegemite and teas in my hold luggage.