Jean Véronis
Aix-en-Provence
(France)



Friday, March 25, 2005

Google: A snapshot of the update



As I said in yesterday's post, Google is currently undergoing major modifications. The issue is no longer a simple index update but an in-depth correction of the extrapolation routines and boolean logic, in order to fix the count aberrations that I showed in early February.

The operation must be very difficult, since it seems to have been going on for a month or so. Google has not yet managed to update all its "Data Centers". There seem to be three different groups of Data Centers at this point in time:
  1. some have not been corrected and still show the previous behaviour (8 billion results for "the", wrong boolean logic)
  2. some have been partially corrected (about 3 billion results for "the", but boolean logic still flawed)
  3. some seem to have reached the final configuration (about 3 billion results for "the", boolean logic fixed).
Example:

Group  Data Center     the            chirac     chirac OR chirac
1      64.233.161.99   8,000,000,000  3,270,000  1,750,000
2      64.233.189.104  3,800,000,000  2,150,000  1,970,000
3      66.102.7.99     3,800,000,000  1,970,000  1,970,000
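
The grouping can be expressed as a small decision rule. Here is a minimal Python sketch, using the three rows of the example as input. The names `Counts` and `classify` and the 5 billion threshold separating the "old" and "new" index sizes are my own assumptions for illustration; they say nothing about how Google actually works.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Counts:
    """Result counts reported by one Data Center (hypothetical container)."""
    the: int           # count for the query: the
    term: int          # count for the query: chirac
    term_or_term: int  # count for the query: chirac OR chirac


def classify(c: Counts, old_index_threshold: int = 5_000_000_000) -> Optional[int]:
    """Return group 1, 2 or 3, or None for a combination not described above.

    The threshold is an assumed cut-off between the 8 billion and the
    roughly 3 billion behaviour, not a documented value.
    """
    old_index = c.the >= old_index_threshold
    # X OR X must return exactly the same set as X.
    boolean_ok = c.term_or_term == c.term
    if old_index and not boolean_ok:
        return 1
    if not old_index and not boolean_ok:
        return 2
    if not old_index and boolean_ok:
        return 3
    return None


# Figures taken from the example table.
observed = {
    "64.233.161.99": Counts(8_000_000_000, 3_270_000, 1_750_000),
    "64.233.189.104": Counts(3_800_000_000, 2_150_000, 1_970_000),
    "66.102.7.99": Counts(3_800_000_000, 1_970_000, 1_970_000),
}

for ip, counts in observed.items():
    print(ip, "-> group", classify(counts))
```

The boolean test relies only on the fact that "chirac OR chirac" should return the same count as "chirac".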

See complete list of results.

Various hypotheses can be made. For example, the new algorithms may still be under test and, for convenience, Google may be testing them only on a subset of machines. Another, deeper reason could be that the update involves not only a mathematical modification of the formulae, but also a major increase of the main index relative to the "supplemental" index (see this post). In the latter case, the limit could very well be a hardware one, and some Data Centers might be awaiting more powerful machines. Nobody can know for sure, but this new kind of Google dance seems quite frantic.

Stay tuned!


1 comment:

David Palfrey said...

Thank you very much for your pursuit of 'googlean logic' - especially if it has indeed been responsible for improving Google's behaviour!

I've picked up two current problems with googlean logic. I'd be interested in your comments.

(1) repeated ANDs: idempotence of AND a bit dodgy

Here are some results I obtained today:

58,800,000 for cat
48,400,000 for cat cat
48,600,000 for cat cat cat
59,600,000 for cat cat cat cat
59,500,000 for cat cat cat cat cat

176,000,000 for car
157,000,000 for car car
226,000,000 for car car car
272,000,000 for car car car car
272,000,000 for car car car car car

52,300,000 for dog
47,000,000 for dog dog
46,900,000 for dog dog dog
62,500,000 for dog dog dog dog
62,200,000 for dog dog dog dog dog

And two days ago:

157,000,000 for car
157,000,000 for car car
224,000,000 for car car car
271,000,000 for car car car car
272,000,000 for car car car car car

47,100,000 for dog
47,200,000 for dog dog
53,100,000 for dog dog dog
62,500,000 for dog dog dog dog
62,600,000 for dog dog dog dog dog
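
Under strict boolean AND, repeating a term should not change the result set (x AND x = x), so every count in a series should equal the single-term count. A minimal Python sketch to quantify how far today's figures stray from that; the function name `max_relative_deviation` is a hypothetical helper of mine, and the counts are simply those listed above.

```python
def max_relative_deviation(counts):
    """Largest relative gap between the single-term count (first item)
    and the counts obtained by repeating the term 2, 3, ... times."""
    base = counts[0]
    return max(abs(c - base) / base for c in counts[1:])


# Today's counts for 1 to 5 repetitions of each word.
series = {
    "cat": [58_800_000, 48_400_000, 48_600_000, 59_600_000, 59_500_000],
    "car": [176_000_000, 157_000_000, 226_000_000, 272_000_000, 272_000_000],
    "dog": [52_300_000, 47_000_000, 46_900_000, 62_500_000, 62_200_000],
}

for word, counts in series.items():
    print(f"{word}: {max_relative_deviation(counts):.0%} off the single-term count")
```

A deviation of zero would be the expected outcome if the counts behaved like exact set sizes; since they are estimates, small gaps might be tolerable, but not gaps of this size.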

(2) Distributivity massively violated (I haven't seen this remarked anywhere else).

11,800,000 for cat AND (dog OR elf)
659,000 for (cat AND dog) OR (cat AND elf)

The first seems plausible, and the second much too low - a bad bug. Here are some relevant counts:

11,400,000 for (cat AND dog)
660,000 for (cat AND elf)
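
The plausibility argument can be made explicit. The size of a union X OR Y must lie between the larger of the two sets and their sum. A minimal Python sketch, with X = (cat AND dog) and Y = (cat AND elf); the function names and the 5% tolerance for rounded, estimated counts are assumptions of mine.

```python
def union_bounds(size_x, size_y):
    """Lower and upper bound on |X OR Y| given |X| and |Y|."""
    return max(size_x, size_y), size_x + size_y


def is_plausible(reported, size_x, size_y, tolerance=0.05):
    """True if the reported union size fits the bounds, allowing an
    assumed relative tolerance because the counts are rounded estimates."""
    low, high = union_bounds(size_x, size_y)
    return low * (1 - tolerance) <= reported <= high * (1 + tolerance)


cat_and_dog = 11_400_000
cat_and_elf = 660_000

print(union_bounds(cat_and_dog, cat_and_elf))
print(is_plausible(11_800_000, cat_and_dog, cat_and_elf))  # cat AND (dog OR elf)
print(is_plausible(659_000, cat_and_dog, cat_and_elf))     # (cat AND dog) OR (cat AND elf)
```

With these figures the first count falls inside the bounds and the second falls far below the lower bound, which is the violation described above.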

June 24, 2005 16:11
