New data acquisitions, feedback tools and bot technology are extending a foundation of research and custom content/information development at TechBio.
Site development and improvements continue as the need arises or as time permits.
Tuesday, January 15, 2008
Wednesday, December 5, 2007
Data Mining, Custom Research, Industry and Company Information Listings
Free information offers huge advantages to small business, academic research and web development in a distributed, agile world when reference databases are coordinated with deep analysis.
techBio is a resource to help develop your data context in any information domain.
Wednesday, October 24, 2007
Current Projects at techBio
Update:
New Projects in 'The Works':
The Yummy List
BoatBoss Boating, Marine, Watersports Portal
Kooky Coconut, Cafe, Indian Rocks Beach, FL
We are working diligently on:
Snapspans
Code|Dub
The Empty Fridge
While others languish in stability:
Budget Blinds of Southern Minnesota, Bob Cotton Owner
techBio Development Group
techBio:wiki
C. Moore Fine Art
Memagon: The Shape of Memes and Memory
Tuesday, October 16, 2007
techBio Quietly Knows About SEO and SEM
Caveat Emptor: Neither I nor techBio is a practicing search optimization or marketing specialist providing such services.
Search engine optimization is described elsewhere. Search engine marketing is a separate but typically conflated set of services.
Two unsupported conclusions come to mind.
1. As major search engines tune their algorithms to provide useful, relevant and authoritative results, the only realistic long play strategy for SEO and SEM is to provide a concise and communicative website, with simple close navigation to concise and informative webpages.
My assumptions, without any special knowledge, are that textual content* is analyzed for keyword prevalence (addressed by keyword optimization) by a simple, deceptively powerful heuristic.
a) key words and terms are collected by matching against a less-common-words dictionary, and scored by relative universal-usage uniqueness (inverse of word frequency in text containing that word)
b) the score for each keyterm is weighted according to its position in the document, with more points added for position in heading text (title, url, h1, h2, h3), position in a list, and emphasis; inverse frequency throughout the domain and the keyword/text ratio are scaled by this proportion
c) scores for ranked and analyzed sites and pages linking in are summed and multiplied by each keyterm, resulting in an ordered list of keywords by computed score
d) the summed value of keyterms in the page is scaled down by a readability analysis score
Everything else on the page will be all but irrelevant with respect to content and design optimization. The last entry (d) in the above list checks for reasonably natural prose, loosely defined as normal distribution of words and phrases and grammatical correctness.
2. Web sites that have built some traffic will be acquired by companies with deep pockets, and their links, traffic, data, content and users merged into larger corporations' domains and services. These sites will be bought at purely economic valuations. Sites which are well developed and run smoothly, provide simple and useful tools and content, and are organically search optimized will be valued far higher than guerrilla coded and jungle optimized websites.
Anyway, that is what I am betting my time and effort on.
* Assuming for text content, in order of priority: title, url, included URLs, remote incoming links and inline local links, and all page text between markup
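To make the guessed heuristic in (a) through (d) concrete, here is a minimal Python sketch. Everything specific in it is an assumption for illustration: the position weights, the tiny common-words set, the default frequency for unknown words, and the function names are all hypothetical, and nothing here is claimed to match what any search engine actually does.

```python
import math
import re
from collections import Counter

# Hypothetical weights: the post names these positions but gives no numbers.
POSITION_WEIGHTS = {"title": 5.0, "url": 4.0, "h1": 3.0, "h2": 2.0, "h3": 1.5, "body": 1.0}

# Stand-in for the "less-common-words dictionary": words listed here are skipped.
COMMON_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "for", "on", "with"}

# Assumed general-usage frequency for words missing from the background table.
DEFAULT_FREQ = 1e-6


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def keyterm_scores(fields, background_freq, inlink_score=1.0):
    """Return (keyterm, score) pairs, highest score first.

    fields: mapping of position name (e.g. "title", "body") to its text.
    background_freq: mapping of word to its relative frequency in general usage.
    inlink_score: summed score of pages linking in, per step (c).
    """
    scores = Counter()
    for position, text in fields.items():
        weight = POSITION_WEIGHTS.get(position, 1.0)
        for word in tokenize(text):
            if word in COMMON_WORDS:
                continue
            # (a) rarer words in general usage score higher
            rarity = math.log(1.0 / background_freq.get(word, DEFAULT_FREQ))
            # (b) occurrences in prominent positions count for more
            scores[word] += rarity * weight
    # (c) multiply each keyterm by the inbound-link score, then order by score
    ranked = [(word, score * inlink_score) for word, score in scores.items()]
    return sorted(ranked, key=lambda pair: (-pair[1], pair[0]))


def page_score(ranked, readability=1.0):
    """(d) Sum the keyterm scores and scale by a readability score in [0, 1]."""
    return readability * sum(score for _, score in ranked)


fields = {"title": "Boating portal", "body": "the boating portal for marine news"}
background_freq = {"boating": 0.001, "portal": 0.01, "marine": 0.001, "news": 0.1}
ranked = keyterm_scores(fields, background_freq)
```

With these made-up frequencies, a rare word that appears in both the title and the body comes out on top, which is the behavior the list above describes.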
Thursday, October 11, 2007
Arrived at the party
Technorati Profile
TechBio is spending lots of time experimenting with passive revenue from Google AdSense, affiliate referrals and long tail development through Snapspans.
It is a test bed for consulting work in the present and near future.
Monday, October 1, 2007
Website As Application Service Provider and Smart Access Point
So, a website is what serves webpages, right? Yes. And WAP, RSS, XML-RPC, URN-database-requests, SOAP, REST, API calls.
Oh.
And Web apps.
Oh.
How about all of these? Uh-oh.
I am pondering the flexibility of websites in light of:
http://go-pear.org
http://cheat.errtheblog.org
Mass marketing is not for computers. Trust me.
I wrote a script which:
- downloads classed webpages
- counts unique words on each page
- counts by class and site
- gives a pretty good approximation of a text diff
- outputs weighted rankings
I need to see if the generated list is a good classifier of pages by class.
I would like a statistician to develop a metric of this performance.
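The original script is not shown here, so the following is only a rough Python sketch of the steps listed above, under stated assumptions. The function names, the naive tag stripping, and the weighting formula (page count scaled by how few classes share the word) are all hypothetical choices, not the script's actual method. The `fetch` helper shows how pages might be downloaded but is not called in the example, which uses inline strings and made-up site names.

```python
import math
import re
import urllib.request
from collections import Counter, defaultdict


def fetch(url):
    """Download a page and return its text (not called in the example below)."""
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("utf-8", errors="replace")


def words(html):
    """Strip tags naively and return the lowercase words on the page."""
    text = re.sub(r"<[^>]+>", " ", html)
    return re.findall(r"[a-z]+", text.lower())


def rank_words_by_class(pages):
    """pages: iterable of (class_label, site, html) tuples.

    Returns (rankings, by_site): rankings maps each class to a list of
    (word, weight) pairs, highest weight first; by_site maps each site to
    a Counter of how many of its pages contain each word.
    """
    by_class = defaultdict(Counter)
    by_site = defaultdict(Counter)
    for label, site, html in pages:
        unique = set(words(html))  # each word counts once per page
        by_class[label].update(unique)
        by_site[site].update(unique)

    n_classes = len(by_class)
    rankings = {}
    for label, counts in by_class.items():
        weighted = {}
        for word, n_pages in counts.items():
            # Assumed weighting: words shared by fewer classes weigh more.
            classes_with_word = sum(1 for c in by_class.values() if word in c)
            weighted[word] = n_pages * math.log(1 + n_classes / classes_with_word)
        rankings[label] = sorted(weighted.items(), key=lambda p: (-p[1], p[0]))
    return rankings, by_site


pages = [
    ("boating", "a.example", "<p>boat marine sale</p>"),
    ("boating", "b.example", "<p>boat dock</p>"),
    ("food", "c.example", "<p>recipe sale</p>"),
]
rankings, by_site = rank_words_by_class(pages)
```

Whether the top-ranked words for a class really separate that class from the others is exactly the open question above; one way to check would be to hold out some labeled pages and see how often their words match the right class's list.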
