My science- and technology-related thoughts. They are sometimes controversial and sometimes based on limited knowledge, and the logic can be imperfect as well. I develop my vision in iterations. Don't take this blog as an attempt to convince anybody of anything.
Each post in this blog reflects my level of understanding of Tectonics of the Earth at the time the post was written; so, some posts may not necessarily be correct now.

22 April, 2013

Ask not what a search company can do for you .., or time to refactor the global knowledge base.

The time for great updates in academia has come; that is the idea of this post. The companies to refactor the "spaghetti" global knowledge could be the likes of Microsoft, Google, Facebook, and Wikipedia. The idea for the post came to me while doing Google searches with the phrases "language processing" and "processing language". For the first phrase, the results were mostly related to natural language processing. For the phrase "processing language", the results were mostly related to the open source programming language "Processing" (thanks guys).

I thought that something was wrong with the search because, regardless of word order, the two words relate to a broad AI-related scientific concept, whereas the programming language "Processing" is just an instance of the programming language concept, probably not related to AI at all. The AI scientific community would need to spend extra time figuring out how to filter the results, if that is possible at all. This is not only about a scientist's productivity; it is about the quality of search and the quality of a scientist's work.

Ask not what a search company can do for you.
So, how should we deal with the issue of "blurred search"?

- Are we to ask the search company to tweak its search algorithm by, say, heavily enhancing its model to provide an "academic layer" option? If a scientist selects the option, the results would mostly be "academic": about concepts, not about particular realizations (instances).

- Or should we ask all web content creators to follow some basic rules when choosing this or that English word, so that their product stays high in search results?
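
To make the first option concrete, here is a minimal sketch of what an "academic layer" switch might do. It assumes every search result already carries a label saying whether it describes a concept or a particular instance; the names SearchResult, kind and academic_layer are hypothetical and made up for illustration only, and no real search engine is known to expose such a label.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchResult:
    title: str
    kind: str  # hypothetical label: "concept" or "instance"


def academic_layer(results, enabled=True):
    """Keep only concept-level results when the academic option is on."""
    if not enabled:
        return list(results)
    return [r for r in results if r.kind == "concept"]


# Illustrative data only; the labels are assigned by hand here.
results = [
    SearchResult("Natural language processing", "concept"),
    SearchResult("Processing (programming language)", "instance"),
]

concepts = academic_layer(results)
print([r.title for r in concepts])
```

The hard part, of course, is not the filter but producing the concept/instance label in the first place, which this sketch simply takes as given.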

Neither of the above, in my opinion.

Time to refactor the global knowledge base.
The options above are just "patches" to the current working system. But does it make sense to "patch" the current working system, or would it be better to develop a new approach and build a new "global knowledge management system" (without destroying the working legacy one)?

Let's step back and look at the bigger picture. The issues are:
- A barrier for a scientist to expose his work. Publishing is expensive and takes a lot of time. A work that is not backed by a good amount of money or authority has little chance to serve science. Instead, a not-so-good idea backed by some kind of authority would make its way up, raise money, and pay back to maintain the authority. The system becomes counterproductive; it doesn't always serve the entire society.

- A barrier for a scientist to access the works of other scientists. The access is mostly not free. To produce a science-related work one needs to look through, say, a hundred works by others. Where is he expected to get the money? It takes a lot of money to serve science. Is that good? Again, the system becomes counterproductive: researchers who are willing to contribute to science don't always have access to the information they need.

- A barrier for fellow scientists to review a science-related work. The system is not transparent in this respect. As far as I understand, there exists a layer of "middlemen" who decide who peer reviews whom. I believe this should work mostly automatically.

- Lack of community features. The community should feature not just the "peer review" practice; it should also provide tools for collaboration.

- Scientific works are mostly examples of "spaghetti" knowledge. The ideas are often NOT a) reasonably normalized and b) separated into loosely coupled, coarse-grained items with clear input conditions, output statements, and a body of logic. The "spaghetti" structure doesn't allow a scientist to reuse the logic of scientific works in automatic mode.

- Automatic connection to a "brain" or "mind" project can't easily be made. Reverse engineering of a system (the brain) only makes sense if the output of the system (the knowledge base, in this context) is not in a "spaghetti" state.

- Automatic connection to something like a "Language learner hub" (see my post) can't easily be made.

- Automatic connection to something like a "Pattern Repository and Expert System Over It" (see my post) can't easily be made.
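
To illustrate what a decoupled knowledge item with "clear input conditions, output statements and a body of logic" could look like, here is a small sketch. It is only one possible reading of the idea: the names KnowledgeItem and apply_items, and the toy "area" and "cost" items, are hypothetical and made up for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet


@dataclass(frozen=True)
class KnowledgeItem:
    name: str
    inputs: FrozenSet[str]    # facts that must be known before the item applies
    outputs: FrozenSet[str]   # facts the item claims to establish
    logic: Callable[[Dict[str, float]], Dict[str, float]]  # the body of logic


def apply_items(items, facts):
    """Apply every item whose inputs are available until nothing new can be derived."""
    facts = dict(facts)
    pending = list(items)
    progress = True
    while pending and progress:
        progress = False
        for item in list(pending):
            if item.inputs.issubset(facts):
                derived = item.logic(facts)
                # An item may only produce the outputs it declares.
                if set(derived) != set(item.outputs):
                    raise ValueError(f"{item.name} produced undeclared outputs")
                facts.update(derived)
                pending.remove(item)
                progress = True
    return facts


# Two toy items, deliberately listed out of order: "cost" depends on "area".
items = [
    KnowledgeItem("cost", frozenset({"area", "price"}), frozenset({"cost"}),
                  lambda f: {"cost": f["area"] * f["price"]}),
    KnowledgeItem("area", frozenset({"width", "height"}), frozenset({"area"}),
                  lambda f: {"area": f["width"] * f["height"]}),
]

facts = apply_items(items, {"width": 2.0, "height": 3.0, "price": 1.5})
print(facts["area"], facts["cost"])
```

Because each item declares what it needs and what it produces, the items can be chained without a human reading the text around them; that is the kind of automatic reuse that "spaghetti" works do not allow.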

How to refactor the global knowledge base, and who should do it.
- The concept of the structure of a scientific work. A company like Microsoft would probably be best placed to lead the development of the concept. The complexity of the products they have been dealing with for decades suggests that their experience can be reused to define what a knowledge item could be, how to "coarse grain" and decouple knowledge items within a scientific work, how to "entry point" a scientific work down to a particular knowledge item, etc.

- Global identification of scientists. Players like Microsoft, Google, and Facebook are quite good at it.

- Tools to develop a scientific work: MS Office, Visual Studio, Open Office, etc.

- The concept of collaboration. Companies like Facebook and Wikipedia have proven experience in the area, so why not reuse it?

- A search system over the global knowledge base: the major search companies.

What to start with.
User identification is already provided by Microsoft, Google, Facebook and others. The next steps could be:
- A portal with an email address to which a user can send his work. The publication (the content of the email) should be accessible to everyone by its message_id.
- A set of templates for office software, or even Visual Studio, to let a user compose and properly format his work.
- A tag system to label the works.
- Community functionality to let users organize into groups and "peer review" each other.
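
The first and third steps above can be sketched in a few lines. This is only an in-memory toy under my own assumptions, not a design for a real portal: the class WorkRepository, its methods, and the address "example-1@portal.invalid" are all hypothetical. The only real API used is Python's standard email module, which reads the Message-ID and Subject headers.

```python
from email import message_from_string


class WorkRepository:
    """Toy store of submitted works, keyed by the email's Message-ID."""

    def __init__(self):
        self._works = {}  # message_id -> {"subject": ..., "body": ...}
        self._tags = {}   # tag label -> set of message_ids

    def submit(self, raw_email):
        msg = message_from_string(raw_email)
        message_id = msg["Message-ID"]
        if message_id is None:
            raise ValueError("email has no Message-ID header")
        self._works[message_id] = {
            "subject": msg["Subject"],
            "body": msg.get_payload(),
        }
        return message_id

    def tag(self, message_id, *labels):
        if message_id not in self._works:
            raise KeyError(message_id)
        for label in labels:
            self._tags.setdefault(label, set()).add(message_id)

    def get(self, message_id):
        return self._works[message_id]

    def find_by_tag(self, label):
        return sorted(self._tags.get(label, set()))


# A made-up submission; the headers and body are illustrative only.
raw = (
    "Message-ID: <example-1@portal.invalid>\n"
    "Subject: A sample work\n"
    "\n"
    "Body of the work.\n"
)

repo = WorkRepository()
mid = repo.submit(raw)
repo.tag(mid, "search", "knowledge-base")
print(mid, repo.find_by_tag("search"))
```

Using the Message-ID as the key means the author gets a citable identifier the moment the email is sent, without any extra registration step on the portal side.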

Probably it's time not only to navigate through academic or scholarly content, but also to start creating knowledge in a new format and to refactor old content.

Thank you.
Sergey D. Sukhotinsky.
Message-ID: <DUB119-W2584966068D58EDBF2EE2DBCB0@phx.gbl>
From: Sergey Sukhotinsky <>
To: Sergey Sukhotinsky <>
Subject: Ask not what a search company can do for you .., or time to refactor the global knowledge base.
Date: Mon, 22 Apr 2013 08:31:07 +0300

Content © 2006-2014 Sergey D. Sukhotinsky