Vanilla 1.1.4 is a product of Lussumo. More Information: Documentation, Community Support.

    • CommentAuthorsukibabee
    • CommentTimeFeb 9th 2007
     # 1
    Evaluating forum packages has been fairly depressing - until I stumbled upon Vanilla. Finally something that isn't yet another clone of every other ugly forum out there (beside bbPress - but Vanilla's better). So I download it - installed it in minutes ... and I love it. But I've got questions. The big one for me is ... will this baby scale? I'm sure I'm not the only one who hopes the site I create will become popular. But it surprises me there aren't more discussions about this topic.

    I've got some more directed questions than other posts :

    -Searches. I don't see any kind of indexing of keywords - nor use of MySQL's "fulltext" feature (built-in keyword indexes for full-text searching). Keyword indexing scares me -it's complicated and bulky - but it's fast. I believe phpBB has dedicated tables to do so. So what gives? I find searching this forum pretty speedy. Does anyone have a good reason why "fulltext" should not be used? Are there issues with hyphenated strings - or multi-lingual?? Are there any plans for some kind of keyword indexing in the future?

    -Caching. There isn't any. Caching HTML and Database queries are the natural choices. Though even with my very limited knowledge of the app - I see problems with both of these - given that Vanilla makes a lot of use of recent data. But - for instance - couldn't caching the main "discussions" page be done on a per user basis - with invalidation able to be performed per user (e.g. when user reads a discussion and the highlight changes) and globally (e.g. when a new post is made) basis? Load balancing this would be the tricky part -see next question. Anyway - are there plans to create a caching mechanism ?

    -Load balancing. This is most important. Replicating or clustering the database can be done -that's not the issue. Having multiple web servers is. I see that all data is stored in a database - except images (which can be centralized via NFS anyway - provided file locking issues aren't a problem). is that all true? If so - it just comes down to session handling :

    I see Vanilla uses PHP's std built-in "SESSION[]". This (by default) creates a PHPSESSID cookie with a key - which maps to a file holding the serialized session variables. While this is fine for most - I have a problem with it -as these "server-side sessions" (meaning that session data is stored on the server) do not enable sessions past one server. Load balancing Vanilla would involve forcing sessions to go back to the machine that created them. While load-balancers (such as "Pen" and "Pound") are capable of doing this - it really doesn't lend itself to "load balancing". Over time - you find one machine is over-loaded with users who never sign off - while others are sitting around doing nothing.

    I'm no expert - but I see Zend provides a package with cross-session support. Prob real expensive - and besides - it's a bandaid approach. Beyond that - I don't see a way of achieving this without creating custom session handling. I see a lot of custom stuff in Vanilla - and I think smart session handling would be a great option.

    So an alternative to "server-side sessions" is "client-side sessions". Rather than have a cookie holding an ID (which maps to your session data) - you put the session data in the cookie. That way - HTTP traffic can simply be round-robin'ed to any web-server - as the cookie provides all the info about the user. And what info is that - all I can see in Vanilla's session info is a UserID (and a blank Password?? - which I'm sure isn't needed). Of course - security is more of an issue with client-side sessions - but can be solved in many ways. I have developed this technique for a large website and it works really well (and is just as secure). The hard part with Client-Side Sessions is to log someone out (completely) to limit session hijacking. I solved this - and I can explain it if anyone is interested. But I'm pretty sure Vanilla could solve this more easily .. though hey .. enough detail. I wrote too much an hour ago..
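A minimal sketch of the client-side session idea, written in Python rather than PHP for brevity. It assumes the only session data is a user id, and that every web server shares one secret key; `SECRET_KEY`, `make_cookie` and `read_cookie` are hypothetical names, not part of Vanilla. The cookie carries the data plus an HMAC signature, so any server can verify it without a lookup.

```python
import base64
import hashlib
import hmac
import json
import time

SECRET_KEY = b"replace-with-a-long-random-secret"  # hypothetical; shared by all web servers


def _sign(body):
    """HMAC-SHA256 over the encoded payload, as a hex string."""
    return hmac.new(SECRET_KEY, body.encode(), hashlib.sha256).hexdigest()


def make_cookie(user_id, ttl_seconds=3600, now=None):
    """Serialize the session payload and append a signature so it cannot be forged."""
    issued = int(time.time() if now is None else now)
    payload = json.dumps({"uid": user_id, "exp": issued + ttl_seconds}).encode()
    body = base64.urlsafe_b64encode(payload).decode()
    return body + "." + _sign(body)


def read_cookie(cookie, now=None):
    """Return the user id, or None if the cookie is malformed, tampered with, or expired."""
    body, sep, sig = cookie.partition(".")
    if not sep:
        return None
    # Constant-time comparison avoids leaking how much of the signature matched.
    if not hmac.compare_digest(sig.encode(), _sign(body).encode()):
        return None
    data = json.loads(base64.urlsafe_b64decode(body))
    current = time.time() if now is None else now
    return data["uid"] if data["exp"] > current else None
```

Note that this sketch only handles forgery and expiry. It does not address the "log someone out completely" problem mentioned above, since a signed cookie stays valid until it expires unless the server keeps some revocation state.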

    Anyway - I'd like to know if anyone has looked into doing this - and what might be involved. I think Vanilla could use an 'alternate client-side session manager' for those wishing to scale. Perhaps it could be an option - I think the feature would be welcomed by many wishing to grow. From my quick poking - I see the People.Class.Session class could be replaced - while needing some modification to People.Class.Authenticator.

    Is it that simple?
    •  
      CommentAuthorADM
    • CommentTimeFeb 9th 2007 edited
     # 2
    Some good questions there. To answer your question about caching, it's currently not possible due to the fact that whispers are present on the main discussion pages and as such they differ between users.

    Though I guess Vanilla could cache things like total post counts for each topic so it doesn't have to query that as often. I know that vBulletin does that (well at least has the option to do that).
    •  
      CommentAuthorMax_B
    • CommentTimeFeb 9th 2007
     # 3
    I asked Mark already, when joining, about the fulltext point. His answer is probably in a search result.
    •  
      CommentAuthorDinoboff
    • CommentTimeFeb 9th 2007 edited
     # 4
    I don't know anything about load balancing, but Vanilla uses PHP's built-in session manager. By default PHP saves the session data in local files, but you can set PHP to save it in a database.
    •  
      CommentAuthorMax_B
    • CommentTimeFeb 9th 2007
     # 5
    @sukibabee
    Hi, I reply publicly because it may be useful for others.
    I apologize for my imprecise statement, the relevant comment is on VanillaDev.
    • CommentAuthorsukibabee
    • CommentTimeFeb 15th 2007 edited
     # 6
    Thank you all for your answers.

    It's a shame about caching. After looking into Vanilla code more, I see it would be really difficult to get right - especially given the extensible nature of the product. Plus it's not really aimed for large forums. Fortunately, throwing more hardware at the problem suffices.

    Dinoboff, I think you are talking about the session_set_save_handler() routine to define user-defined session storage routines, right? Thanks for the pointer. You're right that, provided the callbacks are established before session_start(), I can easily override the storage of the PHP Session ID, and redirect storage to (for instance) a database. PHP itself still creates that Session ID (with extreme low-prob that it will conflict with another server) - but you can simply ignore it. You can use custom cookies and implement everything yourself.
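As a rough illustration of what such storage callbacks do, here is a sketch in Python with SQLite standing in for the shared database. PHP's session_set_save_handler() takes open, close, read, write, destroy and gc callbacks; the class below mirrors only the last four, and the `DbSessionStore` name, the table layout and the `now` parameter are assumptions made for the example.

```python
import sqlite3
import time


class DbSessionStore:
    """Read/write/destroy/gc callbacks backed by one table every web server can reach."""

    def __init__(self, conn):
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS session ("
            "id TEXT PRIMARY KEY, data TEXT NOT NULL, touched REAL NOT NULL)"
        )

    def read(self, session_id):
        """Return the serialized session data, or an empty string if none exists."""
        row = self.conn.execute(
            "SELECT data FROM session WHERE id = ?", (session_id,)
        ).fetchone()
        return row[0] if row else ""

    def write(self, session_id, data, now=None):
        """Store the serialized data and record when it was last touched."""
        touched = time.time() if now is None else now
        self.conn.execute(
            "INSERT OR REPLACE INTO session (id, data, touched) VALUES (?, ?, ?)",
            (session_id, data, touched),
        )

    def destroy(self, session_id):
        """Remove one session, e.g. on sign-out."""
        self.conn.execute("DELETE FROM session WHERE id = ?", (session_id,))

    def gc(self, max_lifetime, now=None):
        """Drop sessions that have been idle longer than max_lifetime seconds."""
        cutoff = (time.time() if now is None else now) - max_lifetime
        self.conn.execute("DELETE FROM session WHERE touched < ?", (cutoff,))
```

Every page load costs at least one read against this table, which is exactly the per-request database hit discussed in the next paragraph.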

    It may be a shock to Vanilla users ... but Databases are slow man. You wanna avoid them - esp on high traffic sites where load-balancing is required. In my case, I'd like to integrate Vanilla on a high-traffic site that already has an efficient client-side session mechanism. Retrieving session variables from the Vanilla database on every page (to enable a user access to other parts of the site) isn't acceptable. Most discussions here relate to molding a site to Vanilla, whereas I'd like to do the opposite.

    So - for others wishing to do the same, check out session_set_save_handler(). I haven't tried, but I think it's the cleanest way to alter Vanilla's session handling.

    I see a quick'n'dirty way also : achieved by using Vanilla's "PersistentSession" (remember me) logic, and assigning different session names (the default is "PHPSESSID") on each server. A persistent session is stored in the database, but only retrieved once for each web-server (which creates its own session, with a unique session cookie name). The "remember me" would have to be forced on, and the logout code needs to clear all the web-servers' session cookies .. and session data can't change. But - it's quick.
    •  
      CommentAuthor[-Stash-]
    • CommentTimeFeb 15th 2007
     # 7
    Why couldn't session data change? Couldn't it just be sync'd across the servers when it did change? I mean, it shouldn't change too often, should it?
    • CommentAuthorsukibabee
    • CommentTimeFeb 15th 2007
     # 8
    Hey Stash - yes, you're right. You'd just have to delete all the session cookies and the database would be read again using the same "PersistentSession" mechanism. Damn you guys are smart. :)
    •  
      CommentAuthorTomTester
    • CommentTimeFeb 15th 2007
     # 9
    @sukibabee: in your quest for scalable forums, are there other candidates you are considering/like better?

    @ADM: you've just given me another reason to hate/disable whispers... by the way, I see little reason why
    (without whispers) it would be impossible to cache certain pages (even if it were just for 30 seconds).

    I once ran a pretty large forum that 'peaked' between 5 and 8 pm daily. We found that even short query
    caching expiry times (e.g. 5 seconds) made a HUGE difference in resource requirements during peak traffic...
    Further optimization, incl. 'page-chunk' caching and user-level differentiation (e.g. guests were read-only, with no
    posting privileges, hence it made very little difference to them if they were looking at a cached page that
    was 5 mins old or a 'fresh' page that was 5 seconds old), allowed us to handle 20x more concurrent users.
    • CommentAuthorsukibabee
    • CommentTimeFeb 15th 2007
     # 10
    @tomtester : I have a few suggestions for you.

    I found apples and oranges in the forum world. let's call them "bad apples" and oranges. The bad apples all look alike - I find them unacceptable from a user-interface perspective (and not easily corrected). These are written by the engineer, so a lot of them are built to scale (phpBB for instance). For the oranges, I found only two : Vanilla and bbPress. I stopped looking when I found Vanilla, there may be more. So suggestion 1 is : look for more here - http://en.wikipedia.org/wiki/Comparison_of_Internet_forum_software

    Some form of caching is built into bbPress (I'm not sure how much) - but from the little I know, the caching is broken right now. bbPress doesn't index keywords, but does use the "fulltext" feature of mysql. It is very likely that bbPress will scale much better than Vanilla. But the main problem with bbPress is that it is still being developed. It isn't as slick as Vanilla, but that may also change down the road. So suggestion 2 : check bbPress out.

    Another is to build caching in Vanilla. I'm real new here, so take what I say with a grain of salt (everything below is 'my opinion', not fact) :

    While I respect and consider what others say, I actually think building a caching mechanism in Vanilla is possible - just pretty darn hard. I'll stick to discussing database query caching, since that interests you. So, there's smart caching and dumb caching :

    Dumb Caching : As you point out, building a "dumb cache" is dirty but very effective. It involves caching data for a page and blindly using it for a set period of time (you used 5 seconds). It was said that it can't be done because of "whispering". I think it just changes the implementation of the cache. Take the main discussions page. When Bob is signed in, he sees all the discussions like most people see them, but popular Sally (who gets a lot of whispers) sees a different list. To do effective caching, you need to cache per user. Imagine the cache is a directory structure: the first directory is the user_id, with that user's cached data held under it. So when Bob accesses the main page, a cache file is generated and put into his cache directory. When he refreshes the page, his cache file is fed back to him. Same for Sally. You could argue that dumb caching per user doesn't help, but visitors will count as one user - so all visitors share the same cache file. Still - it's pretty weak. If you turn off whispering, maybe you don't need per user caching (but I still think you do - see my note below about this).
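A minimal sketch of per-user dumb caching, in Python with an in-memory dictionary standing in for the cache directory. `DumbCache`, `GUEST` and the `build` callback are hypothetical names; `build` represents whatever expensive database fetch produces the page data.

```python
import time

GUEST = 0  # hypothetical id shared by every visitor who is not signed in


class DumbCache:
    """Serve a stored result blindly until it is older than ttl seconds."""

    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self.entries = {}  # (user_id, page) -> (stored_at, value)

    def get(self, user_id, page, build):
        key = (user_id, page)
        hit = self.entries.get(key)
        now = self.clock()
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]  # still fresh enough: no database work at all
        value = build(user_id)  # the expensive database fetch
        self.entries[key] = (now, value)
        return value
```

Because all guests pass `GUEST` as their id, they share one entry per page, while each signed-in user gets a private one.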

    Smart Caching : This is where you cache data and use it until the data changes. This is done by purging the cache file (aka invalidating the cache). The trick is to detect when the cache should be invalidated. If whispering is turned on, and caching is required per user, invalidating everyone's cache files becomes a challenge. It's not acceptable to go remove a thousand cache files (one per user). So a global cache directory should be made, and timestamps compared as part of cache fetching. Per user caching would be very effective in this case. But if whispering is turned off, per user caching *may* not be required (but I still think so) - and would greatly simplify the implementation.
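One way to sketch the "compare timestamps at fetch time" idea in Python. A simple change counter stands in for the global timestamp here, which is an assumption of this example rather than anything Vanilla does; `SmartCache` and its method names are hypothetical.

```python
class SmartCache:
    """Per-user entries that become stale when a global change counter moves."""

    def __init__(self):
        self.generation = 0  # bumped on every write that affects the page
        self.entries = {}    # user_id -> (generation_when_cached, value)

    def invalidate_all(self):
        # One counter bump replaces deleting a thousand per-user cache files.
        self.generation += 1

    def invalidate_user(self, user_id):
        # For changes only one user can see, e.g. reading a discussion.
        self.entries.pop(user_id, None)

    def get(self, user_id, build):
        hit = self.entries.get(user_id)
        if hit is not None and hit[0] == self.generation:
            return hit[1]
        value = build(user_id)  # rebuild from the database
        self.entries[user_id] = (self.generation, value)
        return value
```

A new post would call `invalidate_all()`, while a user opening a discussion would call `invalidate_user()` for that user alone. Stale entries are never deleted in bulk; they are simply rebuilt the next time their owner asks for the page.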

    Up till now, I've been talking about caching data for 'pages' - when in fact the code gets data in terms of a database query. For dumb or smart caching, the cache has to be keyed by the inputs to the database query. Fortunately, the code is already geared for this. Database queries are constructed by calls to the SqlBuilder class. The Select() and GetRow() calls of the Database class are the points to implement the caching mechanism. The cache filename is a construction of all the inputs to the database query. This all sounds easy until you look at the vast number of queries that take place in Vanilla. And some of these inputs may be things like the current time or others that constantly change. So it has to be somewhat smart. It's kinda scary. Cache invalidation would have to happen in the same place - calls in the database to update the data. I think only Mark could write the algorithm to match select inputs with update inputs to do cache invalidation. It looks way too complicated. I doubt he would be interested to do this, it looks really hard.
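To make the "cache filename is a construction of all the inputs" idea concrete, here is a small Python sketch. It assumes the query is available as SQL text plus a list of parameter values; `cache_key` is a hypothetical helper, not part of Vanilla's SqlBuilder or Database classes.

```python
import hashlib
import json


def cache_key(sql, params):
    """Derive a stable, file-name-safe key from the query text and its inputs."""
    normalized = " ".join(sql.split())  # collapse whitespace differences
    blob = json.dumps([normalized, list(params)], default=str)
    return hashlib.sha256(blob.encode("utf-8")).hexdigest()
```

Two calls with the same query and the same inputs map to the same key, and any change in an input gives a different one. That is also the weakness described above: an input such as the current time changes on every request, so such queries would never hit the cache unless the volatile value is rounded or left out of the key.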

    The other option is to cache data at certain places outside the Database class. But this requires invalidation to be done at certain places too. Since it is not a generic cache, it'll likely break and be difficult to maintain.

    so, the note about whispering : The other thing I see unique per user is the highlighting showing which entries have been read and which are new. This is done via the LUM_UserDiscussionWatch table. When a user views a discussion, a timestamp entry into this table is updated. So the main discussions page will still be a unique database fetch per user, even without whispering activated. So I think that feature makes per-user caching mandatory also.

    So my suggestion 3 : don't even try to implement a cache in Vanilla. In writing this up, I've convinced myself that it's just too hard man.

    Suggestion 4 : search for "FARM_DATABASE_HOST" - I don't know much about it, but it looks like Vanilla somehow has built-in support for a database farm. Combine that with web-server load-balancing, and your forum scales - though at a cost.
    •  
      CommentAuthorWallPhone
    • CommentTimeFeb 16th 2007
     # 11
    I wish there was an easy way to invalidate a browser-side cache.

    I do believe I found an easy way to smart cache for guests... could be expanded to work for some users.

    My shared hosting is atrocious in database retrieval times. Maybe I'll give it a go.
    • CommentAuthorsukibabee
    • CommentTimeFeb 16th 2007 edited
     # 12
    WallPhone - can't wait to hear what you come up with.

    I thought of something else :

    It seems the two features of Vanilla that hold back (global - not per user) caching is whispering and the visual cues that depict whether or not you've read a discussion. If you disable both of these features, there's nothing to stop caching the data for the discussions page for both guests and users. Am I right? Everyone knows how to disable whispering. The quick way to disable the "have I read this" cue is to change the style-sheet. Your users will learn to use the forum without that feature.

    In this case, caching shouldn't be so hard. It would be better to invalidate the cache, rather than have a short-timeout. You just need to identify the points of code where invalidation needs to take place ... e.g. when a comment is added, a sink is done... etc. Grep for Update - can't be too hard, but there are probably a lot of these.

    The same technique may work for the data associated with viewing each discussion - but I haven't looked into this. It may involve user-specific database queries (even with whispering disabled?). not sure.

    With the visual cues disabled, it would also be beneficial to disable the database update to the LUM_UserDiscussionWatch table that takes place every time a user opens a discussion. In a replicated database environment, write rates are the limiting factor. I'm guessing this would help a lot.
    •  
      CommentAuthorTomTester
    • CommentTimeFeb 17th 2007
     # 13
    @Suki

    I'm quite familiar with various cache levels/methods.

    'REAL-TIME'
    Experience has taught me that REAL-TIME data is rarely required in a forum setting (it's not
    instant messaging). Just like a CD-Player can give a good representation of an analog signal,
    'dumb-cached' pages/page chunks can provide a 'representation' of a forum that works 'good
    enough' for most users.

    STUPID is SMART
    Since I'm pretty sure nobody can anticipate ALL exceptions, not even if you're Mark, and
    complexity ALWAYS increases the # of bugs, I'm a big proponent of simple (80%) solutions
    and thus 'dumb caching' pages or page elements is the way to go for me.

    SCALE & SACRIFICE
    I'd be more than happy to sacrifice some of the 'NICE BUT NOT REQUIRED' features for speed &
    performance...

    I'm sure even the exact features that PREVENT caching could be replaced by *similar* ones that
    do not require a user-specific database lookup (e.g. local cookies) or can be SPLIT into generic
    and user-specific queries (where the former could be cached).

    Finally, whispers could perhaps be re-coded, e.g. retrieved as a separate, non-cached query and
    then merged on the server, or even off-loaded to the user's machine via JavaScript for page
    creation, interleaving 'regular' comments and whispers?
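A sketch of the server-side merge, in Python. It assumes both lists are already sorted by creation time and that each comment is a dictionary with a 'created' field; `interleave` and the field names are made up for the example.

```python
import heapq
from operator import itemgetter


def interleave(public_comments, whispers):
    """Merge two lists already sorted by 'created' into one ordered page."""
    return list(heapq.merge(public_comments, whispers, key=itemgetter("created")))
```

The public list could come from a shared cache and the whispers from a small per-user query, so only the second part would hit the database on each page load.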

    Just a thought...
    • CommentAuthorToivo
    • CommentTimeFeb 17th 2007
     # 14
    "Some good questions there. To answer your question about caching, it's currently not possible due to the fact that whispers are present on the main discussion pages and as such they differ between users."

    Present whispers as a new tab\page?
    •  
      CommentAuthor[-Stash-]
    • CommentTimeFeb 18th 2007
     # 15
    TomTester, I like the idea of the whispers perhaps being separated and then merged in, if that's possible.
    • CommentAuthormonkie
    • CommentTimeFeb 18th 2007 edited
     # 16
    silly comment - self-censorship activated
    •  
      CommentAuthorTomTester
    • CommentTimeFeb 18th 2007 edited
     # 17
    @Stash: of course it should be possible... anything is possible... this is the U.S. of A! ;-)
    I should really stop proposing things and look at the code myself.

    @Toivo: the 'strength' of whispers however is the fact that they appear in-line with the
    other comments. A separate TAB makes them into private messages (not bad,
    but also not the same).
    •  
      CommentAuthor[-Stash-]
    • CommentTimeFeb 18th 2007
     # 18
    Actually, this is the UK, the Ukraine, Australia, Germany... but hey ;)
    •  
      CommentAuthory2kbg
    • CommentTimeFeb 18th 2007
     # 19
    hahah
    • CommentAuthorToivo
    • CommentTimeFeb 19th 2007 edited
     # 20
    aah,

    but the strength of whispers also comes from the fact that these look like ordinary comments and you post them just like ordinary comments (same look, same logic, as clean as)?
    •  
      CommentAuthory2kbg
    • CommentTimeFeb 19th 2007
     # 21
    Exactly, they allow the private sub-conversation to flow along with the main one.
    •  
      CommentAuthorTomTester
    • CommentTimeFeb 20th 2007
     # 22
    I know/agree y2/Toiv *but* if the present method prevents caching of forum pages, would you prefer to:

    A. - Live without Whispers altogether
    B. - Live with Whispers on a Tab (private messages)
    C. - Live with *another* implementation of Whispers that looks the same but is implemented differently (e.g. via browser-level 'comment/whisper-interleaving')
    D. - Live with the current Whisper stuff and screw all those speed-freaks

    Of course I'm being facetious here... but I really believe the present Whisper implementation (comments
    marked as private) is sub-optimal... I *think* it limits page 'cache-ability' and is also at the root of several
    'positioning' errors of various extensions.

    Also note: initially I did *not* get the benefit of the Whispers over Private messages. I think Whispers
    are one of the more difficult things to convey to Vanilla-'noobs' and experienced 'other forum'-users.

    In that context it may make sense to re-think the Whispers too.
    •  
      CommentAuthor[-Stash-]
    • CommentTimeFeb 20th 2007
     # 23
    C sounds good, when can we expect it? :)
    •  
      CommentAuthorMark
    • CommentTimeFeb 20th 2007
     # 24
    It may be a shock to Vanilla users ... but Databases are slow man. You wanna avoid them - esp on high traffic sites where load-balancing is required. In my case, I'd like to integrate Vanilla on a high-traffic site that already has an efficient client-side session mechanism. Retrieving session variables from the Vanilla database on every page (to enable a user access to other parts of the site) isn't acceptable.

    Too right! I've been telling people that about databases for a long time, but they just don't want to hear it. All of Vanilla's configuration settings are in php-editable files for precisely that reason: I don't want to have to pull those out of the database on every page load. I think you'll be hard-pressed to find any database driven forum that doesn't call the database for some data on every page - if it's not user related information, it will be general configuration settings.

    I made Vanilla pull the user's id and role information on every page load so that permissions are always up to date. If I ban you, you won't be able to see the forum the next time you load a page. Other forum packages have problems with things like this - you ban someone and they stay signed in until their session expires and they have to re-authenticate (either by re-signing in or by cookie).
    •  
      CommentAuthorTomTester
    • CommentTimeFeb 20th 2007 edited
     # 25
    @Stash:

    Speaking of non-database use... (and perhaps useful as a start for browser-based whisper interleaving) did anyone see this:

    http://simile.mit.edu/exhibit/
    Exhibit
    Exhibit is a lightweight structured data publishing framework that lets you create web pages with support for sorting, filtering, and rich visualizations by writing only HTML and optionally some CSS and Javascript code.

    It's like Google Maps and Timeline, but for structured data normally published through database-backed web sites. Exhibit essentially removes the need for a database or a server side web application. Its Javascript-based engine makes it easy for everyone who has a little bit of knowledge of HTML and small data sets to share them with the world and let people easily interact with them.
    [...]

    How Does Exhibit Work and Why Use It?
    Exhibit consists of a bunch of Javascript files that you include in your web page. At load time, this Javascript code reads in one or more JSON data files that you link from within your web page and constructs a database implemented in Javascript right inside the browser of whoever visits your web page. It then dynamically re-constructs the web page as the visitor sorts and filters through the data. As the visitor interacts with the web page, only the web browser is responsible for providing the interaction; the web server is no longer needed.

    So, where's the database, again? The data is stored in JSON files, and the database is implemented in Javascript and running inside the web browser.

    The advantages of Exhibit are as follows:

    * No traditional database technology involved even though Exhibit-embedding web pages appear as if they are backed by databases. So you don't have to design any database, configure it, and maintain it. After all, if you only have a few dozens of things to publish rather than thousands, why would you spend so much effort in dealing with database technologies?
    * No server-side code required even though Exhibit-embedding web pages are heavily templated. So, there is no need to learn ASP, PHP, JSP, CGI, Velocity, etc. There is no need to worry which server-side scripting technology your hosting provider supports.
    * No need for web server if you only want to create exhibits and keep them on your own computer for your own use. They work straight from the file system.
    [...]
    •  
      CommentAuthor[-Stash-]
    • CommentTimeFeb 21st 2007
     # 26
    Certainly sounds interesting. So this basically puts the strain on the end user rather than the server? I wonder how this affects mobile devices?
    •  
      CommentAuthorTomTester
    • CommentTimeFeb 21st 2007
     # 27
    Hmmmm, you've got a point there: of course this won't work on mobile devices...
    (at least not the CURRENT mobile devices).

    Case in point:
    My blackberry 8700g chokes and ends up with an 'uncaught exception' (i.e. nobody
    would even dare call this choking gracefully).

    Of course where there's problems, there's solutions... Theoretically you can switch
    page generation to the server for specific browser IDs etc. Perhaps people on PDAs
    can also live without Whispers ;-)
    •  
      CommentAuthorTomTester
    • CommentTimeFeb 21st 2007
     # 28
    (PS) Of course I *never* even tried Vanilla on my BlackBerry... Color me impressed,
    it's darn fast *and* pretty. The only problem I see is a repeat of the USER ICON
    causing a whole line of 'stashes'
    • CommentAuthorJotango
    • CommentTimeFeb 22nd 2007
     # 29
    We are evaluating Vanilla for our environment. Since we have made it policy to not allow mySQL searches anymore (don't scale, kill the database) we would integrate Vanilla with Solr (Lucene). If we choose Vanilla for our new forums we would contribute the code. But don't hold your breath, first we need to integrate Vanilla with our user/login system. Are there tutorials for that by the way, besides the one for integrating Vanilla with Wordpress? Interesting for us would be: a) how to use different login credentials and b) how to munge Vanilla with external data. Thanks
    • CommentAuthormooseroo
    • CommentTimeMay 1st 2007
     # 30
    What kind of performance are others getting out of Vanilla? When you're talking large-medium-small scale, what sizes/levels of activity are you actually talking? It would be nice to get at least a ballpark estimate of the volume of traffic, concurrent users, other measures of performance that you're talking about.

    Maybe a bit about the processor/memory/OS you're running as well.

    I too am evaluating Vanilla for a larger installation. It would be nice to know if we're on the same scale with one another.
     # 31
    I've been using Vanilla for my forum since March - it was an active forum, the database reached over 300 MB in size.

    The forum has since been brought down - the requests on the CPU using a forum of that size crashed a server.

    There is apparently an upscaling problem with the code - MySQL seems to handle the large DB OK, but the forum takes forever to load. The problem is particularly acute on the home page.

    Unless there is a remedy to this (either trimming down the posts in a safe way or scaling up the solution), I'm going to have to roll the dice, empty out some tables and start again.
     # 32
    I'm assuming you were using a shared server? How many users did you have? And how many posts a day?
    •  
      CommentAuthorMax_B
    • CommentTimeNov 9th 2007
     # 33
    Note also that there have been issues reported here with some extensions adding load.
    •  
      CommentAuthorDinoboff
    • CommentTimeNov 9th 2007
     # 34
    Do you have access to your slow query log?
    •  
      CommentAuthorrel1sh
    • CommentTimeJan 9th 2008 edited
     # 35
    I suspected that posting to this thread would be inevitable, and the time has come, so here goes.

    My relatively new database is getting hammered by Vanilla. It's an HP DL380 G5, dual-Xeon 2.33GHz (quad core, 8 logical CPUs), 16G RAM, running FreeBSD 6.2-REL. This server also serves about 20 other sites, which (without Vanilla running) peak at about 30K queries/minute and never get the server load above 0.50.

    When I let my users onto the Vanilla forum, the load climbs to around 2.5-3.0 and requests to the front page and any discussion page take ~30s to load. Using mtop to monitor the queries, I can clearly see that the Vanilla queries are the ones holding up the gravy train, taking ~30s to sort/send the top 10 and locking the rest. The database is average sized, about 62M with 210,000 comments and 2,500 discussions. I've run EXPLAIN queries on these statements which are taking the longest, but nothing pops out at me as being glaringly wrong.

    There are also a large number of database processes in the Sleep state which accumulate at random intervals on my Vanilla database, which I thought I had disabled by turning MySQL persistence off in the php.ini file.

    Extensions installed:
    AjaxQuote
    Audioscrobbler
    CommentRemoval
    CommentsPermalinks
    DiscussionFilters
    EmericaCrossOver (custom extension which just force-forwards people to our main site login/logout/register pages)
    HtmlFormatter
    IPBlocker
    JQuery
    Legends
    PanelLists
    ParticipatedThreads
    PreviewPost
    PrivateMessages
    UserTasks
    WhosOnline
    YellowFade

    I'm tempted to go out and get a handle on using squid or another caching method (SimpleCache maybe?), but perhaps I can get some insight here first. Are any of the extensions I've got installed known to be dog-slow that I can disable first?
     # 36
    How many concurrent active users/pageviews do you have on vanilla? This site runs on a far less powerful machine than yours (nice boxes the G5's, we have a few at work for virtualisation. lots of shiny blue lights too :)) and obviously is running fine. Since it has double the number of discussions and presumably comments that seems a bit strange - though yours could be a LOT more active just newer, if you see what I mean. I know this box also hosts 5+ other sites but I'm not sure exactly how many.

    The only extensions I can see on there which might be worth looking at (i.e. disabling, to see if it makes a difference) are ParticipatedThreads (though I believe this runs on a separate page so shouldn't be too bad unless users are using it a lot) and maybe PrivateMessages (haven't checked out how this works but generally speaking whispers put a lot of load on DB calls - I guess in theory PM should reduce that load but I'm not 100% sure).

    Which queries are they that are taking up all the power? I'm guessing probably the one for the all discussions page?
    •  
      CommentAuthorrel1sh
    • CommentTimeJan 9th 2008
     # 37
    Ya, the time-consuming queries appear to be the ones for standard Discussion pages (not the front page). Here's a generic example that seems pretty representative of the ones taking the longest. This one was nearing the 60s mark when I ran the process list:

    SELECT m.CommentID AS CommentID, m.DiscussionID AS DiscussionID, m.Body AS Body, m.FormatType AS FormatType, m.DateCreated AS DateCreated, m.DateEdited AS DateEdited, m.DateDeleted AS DateDeleted, m.Deleted AS Deleted, m.AuthUserID AS AuthUserID, m.EditUserID AS EditUserID, m.DeleteUserID AS DeleteUserID, m.WhisperUserID AS WhisperUserID, m.RemoteIp AS RemoteIp, a.Name AS AuthUsername, a.Icon AS AuthIcon, r.Name AS AuthRole, r.RoleID AS AuthRoleID, r.Description AS AuthRoleDesc, r.Icon AS AuthRoleIcon, r.PERMISSION_HTML_ALLOWED AS AuthCanPostHtml, e.Name AS EditUsername, d.Name AS DeleteUsername, t.WhisperUserID AS DiscussionWhisperUserID, w.Name AS WhisperUsername
    FROM LUM_Comment m
    INNER JOIN LUM_User a ON m.AuthUserID = a.UserID
    LEFT JOIN LUM_Role r ON a.RoleID = r.RoleID
    LEFT JOIN LUM_User e ON m.EditUserID = e.UserID
    LEFT JOIN LUM_User d ON m.DeleteUserID = d.UserID
    INNER JOIN LUM_Discussion t ON m.DiscussionID = t.DiscussionID
    LEFT JOIN LUM_User w ON m.WhisperUserID = w.UserID
    LEFT JOIN LUM_CategoryRoleBlock crb ON t.CategoryID = crb.CategoryID
    AND crb.RoleID =1
    WHERE (crb.Blocked = '0' OR crb.Blocked =0 OR crb.Blocked IS NULL)
    AND (m.Deleted = '0' OR m.Deleted =0 )
    AND (m.WhisperUserID = '0' OR m.WhisperUserID =0 OR m.WhisperUserID IS NULL)
    AND m.DiscussionID = '5100'
    ORDER BY m.DateCreated ASC
    LIMIT 4980 , 20

    Upon closer inspection, I noticed that they were all hitting the same discussion, the "conversation" thread on my forum with over 27,000 comments. Digging deeper still to see who's hitting my forum despite currently having a RewriteRule to elsewhere on the site, I'm getting multiple simultaneous hits from GoogleBot and the Yahoo crawler. Perhaps this has something to do with it.
     # 38
    Yeah I suspect having a 27k comment discussion would hurt a bit.... Can you try closing that discussion for a while and see if it helps?
    •  
      CommentAuthorMax_B
    • CommentTimeJan 10th 2008
     # 39
    Running EXPLAIN on your query probably shows, on the second line, that MySQL must examine 27000 rows and then, from the extra information on the first line, run a filesort on the result. This is definitely a bad query.
    I'm curious to know how much adding an index on m.DateCreated (to speed sorting) would help. I tried it on a test install, but the EXPLAIN output does not reflect the new index, as I thought it should, and still mentions 'using filesort'.
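For illustration only, here is a Python sketch that uses SQLite as a stand-in for MySQL to show how a composite index on the filter column plus the sort column can remove the separate sort step. The table is a simplified, hypothetical version of LUM_Comment, and SQLite's planner is not MySQL's, so whether MySQL would pick such an index for the full Vanilla query with all its joins is untested here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE comment ("
    "comment_id INTEGER PRIMARY KEY, discussion_id INTEGER, date_created TEXT)"
)

# Same shape as the slow query: filter on one discussion, sort by date, deep offset.
QUERY = (
    "SELECT comment_id FROM comment WHERE discussion_id = ? "
    "ORDER BY date_created ASC LIMIT 20 OFFSET 4980"
)


def plan(connection):
    """Return the human-readable steps of SQLite's query plan."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + QUERY, (5100,)).fetchall()
    return [row[-1] for row in rows]


plan_before = plan(conn)  # includes a temporary b-tree used for the ORDER BY
conn.execute(
    "CREATE INDEX idx_discussion_date ON comment (discussion_id, date_created)"
)
plan_after = plan(conn)   # rows come out of the index already in date order
```

Even with the sort gone, a large OFFSET still makes the database walk past all the skipped rows, so very long discussions stay more expensive than short ones.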

    Anyhow, robots on such a discussion are not welcome.
    • CommentAuthorscherem
    • CommentTimeJan 13th 2008
     # 40
    Is there a way to auto-limit threads, or shut them down when they get too large?
    •  
      CommentAuthorrel1sh
    • CommentTimeJan 14th 2008
     # 41
    @Minisweeper
    I'm sure that's not helping. I actually just enabled PFC and it seems to have been quite well received in lieu of a dedicated "conversation" type thread.

    @Max_B
    I did run EXPLAINs on some of the queries and noted the filesorts, which obviously would be best to avoid. I'm not sure if there's a way in Vanilla to specify which index(es) to use, but that would be a nice addition to Mark's SQLBuilder class if it isn't.

    post-mortem:
    Thanks for the help everyone. After forbidding robots from accessing the forum, load has gone down by orders of magnitude. I'm serving to ~15-40 simultaneous users with the load peaking around ~0.50 on the database end. As it should be.
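The post does not say how the robots were forbidden. One common way is a robots.txt rule, and the Python sketch below checks what such a rule would tell a well-behaved crawler. The /forum/ path and the example URLs are assumptions, not taken from the site in question.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that keeps all crawlers out of the forum directory.
ROBOTS_TXT = """\
User-agent: *
Disallow: /forum/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def crawler_allowed(user_agent, path):
    """True if a crawler that honours robots.txt may fetch the given path."""
    return parser.can_fetch(user_agent, path)
```

robots.txt is only advisory, so crawlers that ignore it would still need to be blocked at the web server.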
     # 42
    Must have been a helluva lot of robot activity! Glad you got it sorted though...