myhttp_engine, read only you think? Not so!

Apr. 14th, 2007 | 09:12 am

I was asked in the comments, "Nice, but it's only read-only, right?"

Not true at all!

For example, let's create this table:

mysql> CREATE TABLE `website` (
-> `filename` varchar(125) NOT NULL DEFAULT '',
-> `contents` text,
-> PRIMARY KEY (`filename`)
-> ) ENGINE=HTTP DEFAULT CHARSET=latin1 CONNECTION="http://localhost/"
-> ;
Query OK, 0 rows affected (0.00 sec)

Now, let's insert a page:

mysql> insert into website VALUES ("index.txt", "This is a document on a website");
Query OK, 1 row affected (30.01 sec)

Read the page from curl:

[brian@zim ~]$ curl
This is a document on a website

Now from a select:

mysql> select contents from website WHERE filename="index.txt";
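+---------------------------------+
| contents                        |
+---------------------------------+
| This is a document on a website |
+---------------------------------+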
1 row in set (0.00 sec)

How about an update?

mysql> update website SET contents="Under Construction" WHERE filename="index.txt";
Query OK, 1 row affected (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 0

And just to verify:

mysql> select contents from website WHERE filename="index.txt";
+--------------------+
| contents           |
+--------------------+
| Under Construction |
+--------------------+
1 row in set (0.00 sec)

And finally....

mysql> delete from website WHERE filename="index.txt";
Query OK, 1 row affected (0.01 sec)

mysql> select contents from website WHERE filename="index.txt";
Empty set (0.00 sec)

So no, it's not read-only at all. You can use the XMLReplace syntax to update XML documents on a webserver.

I do not have SSL working yet, but that is just a hop, skip, and a jump away, since I know Mark got SSL working for the AWS/S3 engine. I am using http://hg.tangent.org/mod_methods to check the work (and as I said before, it needs work: it's a big security risk at the moment, and I would need to put in a few hours to fix this). Anyone with some basic knowledge of how to write a CGI can put together their own via:

Script METHOD /cgi_to_execute.cgi

You will need to support HEAD/GET/DELETE/PUT.
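
As a rough illustration of what supporting those four methods involves, here is a minimal sketch written in Python rather than as a CGI. Everything in it is an assumption made for illustration: the in-memory DOCUMENTS store, the DocumentHandler and start_server names, and the status codes. PUT refuses to overwrite an existing document, which is how mod_methods is described as behaving. It is not mod_methods itself, and it has no security checks at all.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory store: URL path -> document body (bytes).
DOCUMENTS = {}


class DocumentHandler(BaseHTTPRequestHandler):
    """Answers HEAD/GET/PUT/DELETE, treating each URL path as one document."""

    def _send(self, status, body=b""):
        self.send_response(status)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        # HEAD returns the same headers as GET but never a body.
        if self.command != "HEAD":
            self.wfile.write(body)

    def do_GET(self):
        body = DOCUMENTS.get(self.path)
        if body is None:
            self._send(404)
        else:
            self._send(200, body)

    # HEAD shares the lookup logic; _send() suppresses the body.
    do_HEAD = do_GET

    def do_PUT(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        if self.path in DOCUMENTS:
            # Assumed behaviour: an existing document must be deleted first.
            self._send(409)
        else:
            DOCUMENTS[self.path] = body
            self._send(201)

    def do_DELETE(self):
        if DOCUMENTS.pop(self.path, None) is None:
            self._send(404)
        else:
            self._send(204)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def start_server(port=0):
    """Start the handler on 127.0.0.1 in a background thread."""
    server = HTTPServer(("127.0.0.1", port), DocumentHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Calling start_server(8080) and pointing curl at http://127.0.0.1:8080/index.txt with -X PUT, -I, or -X DELETE is enough to exercise each method by hand.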

POST is currently mapped to PUT. Someone could also fork the engine, or send me patches that would integrate well, to support WebDAV. Right now only two fields are supported, but I will fix that by having it ship XML if need be. Suggestions for an XML definition would be warmly welcomed. Currently each row is a document, but this could be changed.
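
To make "each row is a document" concrete, here is a small sketch of how a statement against one row might turn into HTTP requests. The mapping is an assumption for illustration and is not taken from the engine's source: the METHODS table and the requests_for helper are hypothetical names, the URL is guessed to be the CONNECTION string plus the filename column, and UPDATE is modeled as DELETE followed by PUT.

```python
from urllib.parse import quote, urljoin

# Assumed mapping from SQL statement to HTTP methods (not confirmed
# against the engine). UPDATE is modeled as DELETE followed by PUT.
METHODS = {
    "SELECT": ["GET"],
    "INSERT": ["PUT"],
    "UPDATE": ["DELETE", "PUT"],
    "DELETE": ["DELETE"],
}


def requests_for(connection, statement, filename):
    """Return the (method, url) pairs one statement on one row might produce."""
    # The document URL is assumed to be CONNECTION + the row's filename.
    url = urljoin(connection, quote(filename))
    return [(method, url) for method in METHODS[statement.upper()]]


if __name__ == "__main__":
    for statement in ("INSERT", "SELECT", "UPDATE", "DELETE"):
        print(statement, requests_for("http://localhost/", statement, "index.txt"))
```

Under these assumptions, the primary key doubles as the document name, and the contents column is the request or response body.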

If you are playing with this, it is better to track it from http://hg.tangent.org/myhttp_engine. I'll drop a new release next week if I hack anything interesting into it over the weekend.


Comments {3}

Roy Corey


from: xerhino
date: Apr. 14th, 2007 07:40 pm (UTC)

That is pretty cool.


from: dossy
date: Apr. 14th, 2007 09:33 pm (UTC)

Since PUT doesn't care if you're overwriting documents, does myhttp_engine do a HEAD first to check to see if the object already exists before issuing PUT for an INSERT? While it would make it semantically correct, it seems like an unfortunate performance loss.

Oh, and the fun you could have issuing DELETE FROM website ... oops! :-)

What does myhttp_engine use for its underlying HTTP protocol implementation? Does it support HTTP Keep-Alive connections? I imagine you'd have to turn off pipelining if you choose to support multiple concurrent HTTP connections.

I'm already thinking about using a variant of myhttp_engine (well, a custom MySQL storage engine) to place in front of a proprietary data store at AOL. How are indexes handled for custom storage engines? Can I create an index in a MyISAM database for a table in a MyFOOBAR storage engine? i.e., I have arbitrary data which I expose to MySQL using a custom storage engine, but I want MySQL to index it. Possible?

Alternatively: custom storage engine in front of Lucene for indexing. Anyone do this? (Is Lucene actually any better than MyISAM's built in full-text indexing?)


Brian "Krow" Aker


from: krow
date: Apr. 15th, 2007 02:35 am (UTC)

mod_methods does not allow a PUT where a file already exists. You have to explicitly remove it (and I believe that this is what the RFC requires).

myhttp is using CURL. It seems to be a pretty reasonable library. Right now I am not doing keepalive, but will probably do so in the very near future (though only for the life of a single SQL statement). Concurrency to the HTTP server is based on connections from the MySQL server itself. Each thread has its own HTTP connection... I could do pooling, but someone would need to show a need for it.

Indexes are a per-engine feature, though there is a generic hash and b-tree available for use.

Is Lucene better? I don't know for sure, but I suspect it is. There is one group layering a FULLTEXT engine on top of another storage engine. Their implementation is a bit of a hack, but it works.

You can always subclass a storage engine, and just extend it that way.
