I am making a website to showcase artwork. I want to add auction functionality. I will use the Angular framework and wondered if anyone had any experience doing this.
Functionality should be as follows:
Only admin users through an admin page with the specified password can add art for auction
Users can bid on auction
Auction can have a minimum price
Payment can be dealt with over email or on the website (not sure what is easiest)
Automated email response when bid has been placed
Every bid is stored, so if the winner doesn't purchase, the next closest bidder can try
Greatly appreciated,
I understand this may be a huge task - or may not be!
Yep, that's an entire project that people build entire businesses (and thus teams of developers) around.
Can it be done in a simpler fashion? Yes.
Can it be done well in a simple fashion? No.
I'll start throwing a range of things at you so that you can see how much is really involved in handling all of that. Just a glimpse.
You need hosting, there's always a fun start. Self hosting? Cloud hosting? AWS, Azure, other?
You need storage - some kind of database (though definition of such is fairly loose these days). For both user data and art/auction data.
You need to pick a back-end tech; Angular can't (or shouldn't) be the whole app. You should have something doing all of this coordinating, processing, saving, updating, deleting, etc. etc.
That's just the overview of the parts of the system as a whole, the two second version of your various points:
Admin users
Not just the ability for having users, but now you have different roles and thus different access rights and pages that are accessible depending. Though I note you use the phrase "with the specified password" which means you aren't really thinking of having an actual user-based system, just some brute-forceable basic hard-coded string put in there somewhere.
Note: especially if you're expecting to be dealing with money, this will not fly. Security is top of your list, so you need a full, proper, auth system in place.
Users can bid
Well I'd hope so, being the basis for this it seems. That's simple enough; requests through to the server to store relevant bids.
However, you need to worry about concurrency - how often will their page be updated so that they are aware of other bids? Does your page poll for changes every x seconds, or will you use websockets so that you can push updates to the clients in real time?
Min price
Yeah, sure, details details.
Payment
Over email? Outside of the odd £5 or something, people buying artworks are not likely to be happy randomly paying for things via email. Sure, maybe, depending on what you're doing and your audience, maybe that's fine. A proper website, however, will provide online payment options; that way buyers have confidence and some protection (against you, in this case).
Take payments? Two options, taking them yourself or via a third party. Doing it yourself is a pain in the backside and there's probably not much of a chance your project would pass the requisite security checks to be approved. So you'd need to work through a 3rd party - those are typically pretty easy to set up (they WANT you to use their services, after all), but it's still another chunk of work you need to do to integrate that.
Automated emails
Sure, emails, easy. Though you need to decide who your provider is. Self hosted? SendGrid? AWS? etc.?
You'll need to keep an eye on your account every now and then, as providers tend to keep track of your 'reputation' and any bounces, etc. that you get, and if you drop too low, you get no service. That is how they keep spammers out.
Bids stored
Sure, everything should be stored. Bare minimum requirement.
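To make the "min price" and "bids stored" points a bit more concrete, here is a minimal in-memory sketch in Python. The `Auction` and `Bid` names, and the rule that a new bid must beat the current highest one, are assumptions made for illustration; a real back end would persist bids in a database and use transactions to cope with concurrent requests.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Bid:
    bidder: str
    amount: int  # smallest currency unit (e.g. pence) to avoid float rounding


@dataclass
class Auction:
    min_price: int
    bids: list = field(default_factory=list)  # every accepted bid is kept

    def place_bid(self, bidder: str, amount: int) -> bool:
        """Accept a bid only if it meets the minimum and beats the current highest."""
        highest = max((b.amount for b in self.bids), default=0)
        if amount < self.min_price or amount <= highest:
            return False
        self.bids.append(Bid(bidder, amount))
        return True

    def ranked_bidders(self) -> list:
        """Bidders ordered by their best bid, highest first, without duplicates."""
        ranked = sorted(self.bids, key=lambda b: b.amount, reverse=True)
        seen = []
        for bid in ranked:
            if bid.bidder not in seen:
                seen.append(bid.bidder)
        return seen
```

If the winner never pays, `ranked_bidders()` gives the order in which the remaining bidders could be offered the lot.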
That's just covering your points, let alone the effort of the unspoken requirements in creating an entire site that's acceptable for use and doesn't scare the users away two seconds after they look at it - it needs to look reputable.
You need registration pages, account management pages, galleries, search features (if you're that big?), admin pages (user management, auction management, etc), the lots themselves with the whole bidding mechanisms involved...
That said, you can probably put one together relatively quickly with various website builders - WordPress has a bajillion plugins and I'm sure probably runs half the websites out there at this point, and likely has everything needed to do that; but it wouldn't be Angular, and it wouldn't be your stuff; you're trusting whoever developed those things to have them work, maintain them, etc. etc. while you figure out how they all work and get it all running...
TL;DR: There's a reason devs are pretty well paid. It's always more complicated than you think.
Related
I'm creating a statistics web page which can see sensitive information.
The webpage has a table holding a large amount of data, which is editable and stored in the server's database. But it needs to be hidden until the user has properly authenticated (e.g. logged in), both the table itself and its code. Most of the answers I found on Stack Overflow say this is basically impossible. But lots of well-known websites seem to hide such things well, so I guess there are some solutions to the problem.
At first, I built a full stack with a React, Express, Node and MariaDB toolchain.
The React client is responsible for rendering the contents of the webpage and the editable tables, and for sending requests to submit edited content.
Node with Express is responsible for retrieving data from the DB and updating the DB (it provides the data for the client side to manipulate, that's all).
A problem arises when I consider the security of the client-side code. I want to hide all content of the page (not just the data from the server, but also its logic and features).
To achieve my goals, I considered several things, but I doubt whether they are right and would work well if I built them.
Using server-side rendering -- cannot use due to performance reasons and lack of available resources
Server-side rendering can hide logic from the user because the server emits only HTML; all actions are submitted to the server, which handles them and returns the result.
So I can provide only the login page at first and, if login is successful, send the rest of the HTML and its logic from the server.
The problem is that the content of my webpage is massive and the user will interact with it very often, and since I apply virtualization to my table (for performance reasons), its data and rendering logic have to be handled by the web browser.
Combining SSR and Client-Side Rendering
This is my own speculation; I am not sure whether it is possible.
Use SSR for hiding the content of the site from unauthorized users and, once authorized, let the web browser render the full content on demand. (Code and logic should be hidden before authorization; an unauthorized user can only see the login page)
Is it possible to do it?
Get code on demand.
This is also my own speculation, and it is what I am looking for. But I strongly doubt whether it is possible.
The workflow is as follows:
If a user is not logged in:: User only can see the login page and its code
If the user is logged in:: User can see features of the page like management, statistics, etc.
If the user approaches specific features:: Rendering logic and HTTP request interface is downloaded from the server (OR less-performance hindering logic or else...) and it renders what users want to see and do.
It is okay if the ideas above lead nowhere. Can you provide an outline for implementing this kind of web page? I'm quite new to web programming, so I cannot find the proper way. I want to know how I can achieve this, and with what kinds of solutions, libraries and structure.
What lib or package should I use for this?
How can I implement this?
OR can you describe to me how modern websites achieve this? (I think the SAP system closely resembles what I want to achieve)
Foreword
Security is a complex topic, in which it is not possible to reach zero threat. I'll try to craft an answer that fulfils what you are looking for.
Back end: Token, credentials, authentication
So, you are currently using Express for your back end, hence the need to protect access to this part. Many solutions exist; I favor token authentication, but you can do something with username/password (or this) to let the users access the back end.
From what you are describing you would use some sort of API (REST, GraphQL etc.) to connect to the back-end and make your queries (fetch, cross-fetch, apollo-link etc.) and add the token to the call to the back end in the headers usually.
If a user doesn't have the proper token, they have no data. Many sites use that method to block the consumption of data from the users (e.g. Twitter, Instagram). This should cover the security of the data for your back end, and no code is exposed.
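As a rough sketch of the idea (not a replacement for a maintained JWT or session library), a signed token can be issued at login and checked on every API call. The example is in Python for brevity rather than Express, and `SECRET`, `issue_token` and `verify_token` are hypothetical names introduced here.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

SECRET = b"change-me"  # assumption: a server-side secret that never reaches the client


def _sign(body: bytes) -> bytes:
    return base64.urlsafe_b64encode(hmac.new(SECRET, body, hashlib.sha256).digest())


def issue_token(user: str, ttl_seconds: int = 3600, now: Optional[float] = None) -> str:
    """Return 'payload.signature', both parts URL-safe base64."""
    issued = time.time() if now is None else now
    payload = json.dumps({"user": user, "exp": issued + ttl_seconds}).encode()
    body = base64.urlsafe_b64encode(payload)
    return (body + b"." + _sign(body)).decode()


def verify_token(token: str, now: Optional[float] = None) -> Optional[str]:
    """Return the user name if the token is authentic and unexpired, else None."""
    try:
        body, sig = token.encode().split(b".")
    except ValueError:
        return None  # not exactly two parts
    if not hmac.compare_digest(sig, _sign(body)):
        return None  # payload or signature was tampered with
    claims = json.loads(base64.urlsafe_b64decode(body))
    current = time.time() if now is None else now
    return claims["user"] if claims["exp"] > current else None
```

The client would send the token in a request header, and the server would call `verify_token` before touching the database; a missing or invalid token simply yields no data.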
Front-end: WebPack and application code splitting
Now the tricky part, so you want the client side not to have access to all the front-end at once but in several parts. This has 2 caveats:
It will be a bit slower than in normal use
Once the client logged in once, he will have access to the application
The only workaround I see in this situation is to use only server side rendering, if you want to limit to the bare minimum the amount of data the client has on your front end. Granted it is slow, but if you want maximum protection that is the only solution.
Yet, if you still want to keep some interactions and have a faster front end, while keeping a bit of security, you could use some code splitting with WebPack. I am not familiar with C so I can't say, but the Multiple page application of WebPack, as I was mentioning in the comment, should give you a good start to build something more secure.
First, you would have for example 2 html files for entering the front end: one with the login and one with the application. The login contains only the Javascript modules that are for entering the application and shouldn't load the other Javascript modules.
All in all, entrypoints are the way you can enter the application, this is a very broad topic that I can't cover in this answer, but I would recommend you to follow WebPack's tutorial and find out how you can work this out.
I recommend the part on code splitting, but all the tutorial is worth having a look.
Second, you will have to tweak the optimisation module. It is usually a module that tries to reduce the size of the application by merging methods that are used by different parts or that are redundant: you don't want this.
In your case, you don't want un-authenticated users to have access. So you would probably have to change things there (this is another topic too broad to cover in a single answer, since you would have to decide what you keep for optimisation and what you remove for security), but here is the link to the optimisation module and a heads up: you will have to modify the SplitChunksPlugin so that it does not do this optimisation.
I hope this helps. There are many solutions at hand and this is not a comprehensive guide, but it should give you enough material to get to what you need.
Update: I am getting the impression that this is not even the right website to post this. If someone can point me in the right direction, I'd be appreciative...
I have an existing PHP+MySQL application that wasn't built to render "real-time" or similarly live-style data. But now I need to build in a way to pull nearly real-time data into the application and keep the data on the page fresh. This live data is only for 1 page in the application.
Looked at things like socket.io and PHP-based websockets libraries, but it seemed like overkill because the data is basically coming from 1 source and being delivered to 1 person (the client). Multiple other users could have this process running, but each one would bring their own data endpoint. That's... like a year down the road. But good to think about. Would ideally have hundreds, or thousands of users on the system, pulling their live-ish data. So I want this to be as streamlined and low-impact as possible.
Users must be authenticated and authorized to consume the data. This is already baked into the current system.
The API to get the data (which has already been built by another vendor) is also NOT streaming. It's set on a 20-second cron, so the new data is available every 20 seconds, which satisfies the client's needs.
My current plan is to do something like this...
Data is pulled on a cron every 20 seconds, organized, and stored into the database (complete)
Adjust #1 so it also does any additional proprietary calculations on data AND compiles + writes a JSON file on the server (unique to the user) which is the exact data needed for the front end (DB data is needed for other pages)
Create small PHP-based service which validates a client-provided JWT and reads the JSON file out
Write AJAX front end to poll endpoint from #3 every X seconds using a JWT for authorization
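Steps 2 and 3 of the plan above could look roughly like this. It is sketched in Python rather than PHP purely for illustration (PHP's `rename()` can play the same role as `os.replace` here); the function names are hypothetical, and `user_id` is assumed to come from the already validated JWT, never from raw user input.

```python
import json
import os
import tempfile


def write_snapshot(directory: str, user_id: str, data: dict) -> str:
    """Cron side: write the user's JSON file so a poller never sees a partial file."""
    final_path = os.path.join(directory, f"{user_id}.json")
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as tmp:
        json.dump(data, tmp)
    # Renaming within one filesystem swaps the file in a single step.
    os.replace(tmp_path, final_path)
    return final_path


def read_snapshot(directory: str, user_id: str):
    """Endpoint side: return the latest snapshot, or None if none exists yet."""
    path = os.path.join(directory, f"{user_id}.json")
    try:
        with open(path) as fh:
            return json.load(fh)
    except FileNotFoundError:
        return None
```

Writing to a temporary file and renaming it means a request that arrives mid-update still gets the previous complete snapshot.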
This all seems sort of like I might be reinventing the wheel, or missing something. The fact that this is an existing PHP based application (LAMP) does have some limiting factors, but I feel like there's got to be a more efficient way to handle this... It's pretty new to me. Also, I'm open to other technologies that'll run on the LAMP stack, if it'll make things better.
I would say go for the API solution in the beginning :) It fits the architecture better and is certainly the least amount of work. Also, if there is a problem with the "live" feeling of the data, you can fix it by polling more often or by introducing long polling, assuming you change the cron job interval.
I mean, in the end it is all about impact for the time spent; don't start implementing features that customers don't care about :)
The biggest problem to solve is implementing it in a way that fits your requirements and is somewhat extensible in the future. You still have to deal with issues like resolution, timeouts, reducing server processing when requesting data and so on!
For me, if you need to maintain global service state because a single client's request could affect the requests of all other connected clients, then most server-side scripting languages are not the best choice! Also, if you plan on implementing something like this with PHP, you will be setting yourself up for a living nightmare! Why? Because, simply put, PHP's socket implementation is that bad!
I've been tasked with creating a dynamic report builder to extend our current product that should allow our users to configure with relative ease a useful report drawing data from what they've inputted into the system. Currently we customize these reports manually, and this process involves a developer (me) taking the requirements of the report (fields, aggregate totals, percentages, etc) and publishing the results as a relatively interactive page that allows for the ability to 'drill down' for more information in the record rows, etc.
The reports are not extremely complicated, but they're involved enough that programmatically generating these reports doesn't seem possible. I feel like creating an interface to allow the users to customize the look of the report shouldn't be too difficult, though involved in and of itself. Where I am at a loss is how to create an interface that will allow users who have absolutely no 'programming' literacy the ability to easily generate the SQL queries that will pull the information they need.
In fact, they need to be able to create these queries and access the bowels of their inputted data without ever being aware of what they're really doing. I feel for this to work as required, the generation of the report has to be as indistinguishable from magic as possible. The user should be able to drag and drop what he/she needs from sets of possible data and magically produce a report.
I'm up for the challenge of course, but I really don't know where to start. Once I get the gears moving, resolving individual issues will be 'easy' ( well actually more like part of the process), but getting off the ground has been challenging and frustrating. If anyone can offer me a direction to search in, I'm not afraid of putting in the hours. Thank you for your time, and I look forward to some positive suggestions.
I was tasked with something like this before. My advice: don't. Unless the required reports are extremely basic and your users don't care about how the report looks, it'll take a significant amount of time to implement. With you indicating you're a single-person team, just don't. It'd be cheaper for you (even in the long run) to hire a junior developer or intern or something to handle this part of the job.
Now, there are a few different report designers out there. I've not seen any that work completely on a web page, and all of them sucked pretty bad from the non-programmer perspective.
Now, there are ways around this. Most of the people wanting these types of reports know how to work Microsoft Access. You can leverage their knowledge of this to let them create their own reports. This isn't trivial though as you don't want them just connecting to your database. So, here's what I recommend:
Generate a downloadable database compatible with Access
Ensure that the downloaded database is "easy" to work on. This means duplicating data and denormalizing a lot of things
Ensure you don't leave anything sensitive in the downloadable database (passwords, internal things they shouldn't see, etc)
And finally, ensure they can download it in a secure manner and that it's performant. You may need to tell your users the downloadable database is only "synced" once a week or month since it's relatively expensive to sync this in real-time
Have a look at what data warehouses do (e.g. The Data Warehouse Toolkit). They create several basic tables that are very wide, contain a lot of redundant data and cover a certain aspect of the database.
I would create several such wide views and let the users select a single view as the basis for a dynamic report. They can then choose the columns to display, the sorting and the grouping. But they cannot choose any additional tables or views.
Of course, a typical view must really cover everything regarding a certain aspect of the database. Let's assume you have an Order Items view. Such a view would contain all items of all orders offering hundreds of columns that cover:
The order product ID, product name, regular price, discount, paid price, price incl. the associated part of the shipping cost etc.
The order ID, order date, delivery date, the shipping cost etc.
The customer ID, customer name, customer address etc.
Each date consists of several columns: full date, day of year, month, year, quarter, quarter with year etc.
Each address consists of the full address, the city, the state, the area, the area code etc.
That way, the dynamic reporting is rather easy to use because the users don't need to join any tables but have all the data they need.
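A minimal sketch of that idea in Python with SQLite: the user only picks column names from a fixed list, and the query is assembled against the single wide view. The `order_items` view name, the column list and the function names are assumptions for illustration.

```python
import sqlite3

# Hypothetical columns exposed by a wide "order_items" view.
ALLOWED_COLUMNS = {"customer_name", "product_name", "order_year", "paid_price"}
ALLOWED_AGGREGATES = {"SUM", "AVG", "COUNT"}


def build_report_sql(group_by: list, aggregate: str, measure: str) -> str:
    """Build a SELECT over the wide view from user-picked, whitelisted names."""
    if not group_by:
        raise ValueError("pick at least one grouping column")
    if aggregate not in ALLOWED_AGGREGATES:
        raise ValueError(f"unsupported aggregate: {aggregate}")
    for col in [*group_by, measure]:
        if col not in ALLOWED_COLUMNS:
            raise ValueError(f"unknown column: {col}")
    cols = ", ".join(group_by)
    return (
        f"SELECT {cols}, {aggregate}({measure}) AS value "
        f"FROM order_items GROUP BY {cols} ORDER BY {cols}"
    )


def run_report(conn: sqlite3.Connection, group_by: list, aggregate: str, measure: str) -> list:
    """Execute the generated query and return the rows."""
    return conn.execute(build_report_sql(group_by, aggregate, measure)).fetchall()
```

Because every identifier is checked against a whitelist before it reaches the SQL string, the user never writes SQL and cannot reach tables outside the view.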
I'd recommend you to look at ready reporting components. For example Microsoft's Reporting Services, Telerik, DevExpress or (I should confess, our product) SharpShooter Reports
Start looking around, what kind of reporting tools are there? Is there anything out there that even comes near what they are expecting?
The tools around will be generic and your case might be rather specific. You already know part of the answer your users are looking for.
Your solution should help them that way.
You used the word "magic". That should be a huge warning sign. As a developer, we don't do magic, we do logic. We can create illusions, we can't do magic.
I would dive into Sql Analysis Services and Excel.
There is a presentation over here. These guys don't do magic either, but they are able to do a lot.
We use a combination of EasyQuery and FastReport.NET.
EasyQuery allows our users to build dynamic queries and extract the data necessary for report and FastReport - for actual report generation and exporting it to Excel or PDF.
Take a look at zpmsoftware.com. It has an open source report builder for ASP.NET MVC and outputs to screen, Excel and PDF. Not too much trouble to adapt to Webforms. Since most of the fancy stuff is in jQuery/JavaScript, adapting to other server environments should be doable.
Imagine a space shooter with a scrolling level. What methods are there for preventing a malicious player from modifying the game to their benefit? Things they could do that are hard to limit server-side include auto-aiming, peeking outside the visible area, speed hacking and other things.
What ways are there of preventing this? Assume that the server is any language and that the clients are connected via WebSocket.
Always assume that the code is 100% hackable. Think of ways to prevent a client completely rewritten (for the purposes of cheating) from cheating. These can be things such as methods for writing a secure game protocol, server-side detection, etc.
The server is king. Clients are hackable.
What you want to do is two things with your websocket.
Send game actions to the server and receive game state from the server.
You render the game state, and you send input to the server.
auto aiming - this one is hard to solve. You have to go for realism. If a user hits 10 headshots in 10ms then you kick him. Write a clever cheat detection algorithm.
peeking outside the visible area - solved by only sending the visible area to each client
speed hacking - solved by handling input correctly. You receive an event that a user moved forward, and you control how fast he goes.
You can NOT solve these problems by minifying code. Code on the client is ONLY there to handle input and display output. ALL logic has to be done on the server.
You simply need to write server-side validation. The only thing is that game input is significantly harder to validate than form input due to its complexity. It's the exact same thing you would do to make forms secure.
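For the speed-hacking case, server-side validation could look something like the sketch below. `MAX_SPEED` and `apply_move` are hypothetical names, and the elapsed time is assumed to be measured by the server's own clock, not reported by the client.

```python
import math

MAX_SPEED = 5.0  # assumption: world units per second a ship may travel


def apply_move(position, requested, elapsed_seconds):
    """Move towards the requested point, but never farther than the speed limit allows."""
    dx = requested[0] - position[0]
    dy = requested[1] - position[1]
    distance = math.hypot(dx, dy)
    allowed = MAX_SPEED * max(elapsed_seconds, 0.0)
    if distance <= allowed:
        return requested  # a legitimate move, accept it as sent
    # Too far for the elapsed time: clamp to the farthest reachable point.
    scale = allowed / distance
    return (position[0] + dx * scale, position[1] + dy * scale)
```

A hacked client can ask to jump across the map, but the server only ever moves the ship as far as the rules allow.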
You need to be really careful with your "input is valid" detection though. You do not want to kick/ban highly skilled players from your game. It's very hard to strike the balance between bot detection that is too lax and bot detection that is too strict. The whole realm of bot detection is very hard overall. For example Quake had an auto aim detection that kicked legitimately skilled players back in the day.
As for stopping bots from connecting to your websocket directly, set up a separate HTTP or HTTPS verification channel on your multiplayer game for added security. Use multiple HTTP/HTTPS/WS channels to validate a client as being "official", acting as some form of handshake. This will make connecting to the ws directly harder.
Example:
Think of a simple multiplayer game. A 2D room based racing game. Up to n users go on a flat 2D platformer map and race to get from A to B.
Let's say for argument's sake that you have a foolproof system where there's a complex authentication going over an HTTPS channel so that users cannot access your websocket channel directly and are forced to go through the browser. You might have a Chrome extension that deals with the authentication and you force users to use that. This reduces the problem domain.
Your server is going to send all the visual data that the client needs to render the screen. You cannot obscure this data. No matter what you try, a skilled hacker can take your code and slow it down in the debugger, editing it as he goes along, until all he's left with is a primitive wrapper around your websocket. He lets you run the entire authentication, but there is nothing you can do to stop him from stripping out any JavaScript you write to prevent that. All you can achieve is to limit the number of hackers skilled enough to access your websocket.
So the hacker now has your websocket in a Chrome sandbox. He sees the input. Of course your race course is dynamically and uniquely generated. If you had a set number of them then the hacker could pre-engineer the optimum race route. The data you send to visualise this map can be processed faster than a human can interact with your game, and the optimum moves to win your racing game can be calculated and sent to your server.
If you were to try and ban players who reacted too fast to your map data and call them bots, then the hacker adjusts this and adds a delay. If you try and ban players who play too perfectly, then the hacker adjusts this and plays less than perfectly using random numbers. If you place traps in your map that only algorithmic bots fall into, then they can be avoided by learning about them, through trial and error or a machine learning algorithm. There is nothing you can do to be absolutely secure.
You have only ONE option to absolutely avoid hackers. That is to build your own browser which cannot be hacked. Build the security mechanisms into the browser. Do not allow users to edit javascript at runtime in realtime.
At the server-side, there are 2 options:
1) Full server-side game
Each client sends their "actions" to the server. The server executes them and sends relevant data back. e.g. a ship wants to move north, the server calculates its new position and sends it back. The server also sends a list of visible ships (solving maphacks), etcetera.
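The "list of visible ships" part might be sketched like this in Python. The `Ship` type and the rectangular view around the viewer are assumptions; the point is only that the filtering happens on the server, so a hacked client never receives off-screen data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ship:
    name: str
    x: float
    y: float


def visible_ships(viewer, ships, half_width, half_height):
    """Return the other ships that fall inside the viewer's screen rectangle."""
    return [
        ship
        for ship in ships
        if ship.name != viewer.name
        and abs(ship.x - viewer.x) <= half_width
        and abs(ship.y - viewer.y) <= half_height
    ]
```

Only the result of `visible_ships` would be serialised into the state update sent to that particular client.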
2) Full client-side game
Each client still sends their actions to the server. But to reduce workload on the server, the server doesn't execute the actions but forwards them to all other clients. The clients then resolve all actions simultaneously. As a result, each client should end up with an identical game. Periodically, each client sends their absolute data (ship positions, etc.) to the server and the server checks if all client data is identical. Otherwise, the games are out of sync and someone must be hacking.
Disadvantage of the second method is that some hacks remain undetected: A maphack for example. A cheater could inject code so he sees everything, but still only sends the data he should normally be able to see to the server.
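The periodic sync check of the second method could be sketched as below. Comparing hashes instead of full states, and flagging whoever disagrees with the majority, are assumptions added for illustration; the description above only requires that all client data be identical.

```python
import hashlib
import json
from collections import Counter


def state_digest(state: dict) -> str:
    """Hash a game state in canonical form so equal states give equal digests."""
    canonical = json.dumps(state, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()


def out_of_sync_clients(reports: dict) -> list:
    """Given {client_id: digest}, return the clients that disagree with the majority."""
    if not reports:
        return []
    majority, _ = Counter(reports.values()).most_common(1)[0]
    return sorted(cid for cid, digest in reports.items() if digest != majority)
```

Each client would compute `state_digest` over its own simulation and send only the digest; with just two clients, or an even split, the server can tell that something is wrong but not who is at fault.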
--
At the client-side, there is 1 option:
A javascript component that scans the game code to see if anything has been modified (e.g. code modified to render objects that aren't visible but send different validation data to the server).
Obviously, a hacker could easily disable this component. To fix that, you could force the client to periodically reload the component from the server (The server can check if the script file was requested by the user periodically). This introduces a new problem: the hacker simply periodically requests the component via AJAX but prevents it from running. To avoid that: have the component redownload itself, but a slightly modified version of itself.
For example: have the component be located at yoursite/cheatdetect.js?control=5.
The server will generate a slightly modified cheatdetect.js so that in the next iteration, cheatdetect.js?control=22 (for example) must be downloaded. If the control mechanism is sufficiently complicated, the hacker won't be able to predict which control number to request next, and cheatdetect.js must be executed in order to continue the game.
There's nothing you can really do to prevent anyone from modifying your JS or writing a GreaseMonkey script. However you can make it hard for them by minifying your script as well as making your code as cryptic as possible. Maybe even throwing in some fake methods or variables that do nothing but are used to throw an attacker off. But given enough time, none of these methods are completely foolproof, as once your code goes to the client, it is no longer yours.
The only way I can even think of implementing this is by modifying your Javascript to function as a client and then designing a central server mechanism to validate data sent from that client. This is probably a big change to implement and will most likely make your project more complex. However, as was said earlier, if the application runs entirely on the client, the client can pretty much do whatever they want with your script. The only way to secure it to use a trusted machine to handle validation.
They don't have to touch your client-side code -- they could just sniff and implement your Websocket protocol and write a tiny agent that pretends to be a human player.
Update: The problem has a few parts, and I don't have answers off the top of my head, but the various options could be evaluated with these questions in mind:
How far are you willing to go to prevent cheating? If you only care about casual cheating, how many barriers are enough to discourage the casual cheater? The intermediate Javascript programmer? A serious expert? Weighing this against the benefits of cheating, is there anything of real value at stake, like cash and prizes, or just reputation?
How do you get high confidence that a human is providing inputs to your game? For example, with a good enough computer vision library I could model your game on a separate machine and feed inputs to the computer pretending to be the mouse, but this has a high relative cost (not worth my time).
How can you create a chain of trust in your protocol such that knowledge of (2) can be passed to the server, and that your server is relatively confident your client code is sending the messages?
Sure many of the roadblocks you throw up can be side-stepped, but what is the cost to the player and you? See "Attrition warfare".
Some other methods that can be implemented:
Make the target elements difficult for a script to distinguish from other elements. Avoid divs with predictable class and id names if possible. Inject styling using JavaScript instead of using classes. Think like a hacker and make it hard on yourself.
Use decoys that a script will fire on. For instance, if the threat vector is a screen scraping algorithm using pixel colors, throw some common pixel colors in non-target elements. Hits on these non-targets could seem inconsequential to the cheater, but would be detectable. You don't want the cheater to know why you know.
Limit the minimum time between actions to slightly below the best human levels. The best players will hit that plateau, so it won't matter as much who's cheating, and you will immediately be able to detect anyone scripting faster than that by calling methods directly.
Random number generators are typically uniform. Human nature is not. Likely a random number generator will have values within a set limit and even distribution. Natural distribution is a Gaussian curve. If you sampled the distribution and it looks like a square wave in the x and y axis, 100% it's a cheater. This will be fairly difficult for the cheater to detect the threshold for the algorithm because it's a derivative of the random, and not the random distribution itself. You're also using aggregate data and not individual plays to detect it, so reverse engineering the algorithm would be extremely difficult without knowing your detection algorithm.
Utilize entropy whenever possible. Avoid predictable game plays. Imagine a racing game on a set collection of race tracks. Each game play could have slightly differing levels of traction, horsepower, and momentum. The script would have to be extremely good to beat it. In a scrolling game, you can alter factors that are instinctual to humans, but difficult for computers, such as wind force, changes in gravity, etc. It would also make it more fun as a side benefit.
Server generated tokens can be used to validate UI elements were used and not calls to the code itself. Validation can be handled in one call at the end of the game comparing events to hashed codes of UI elements. The token should be a hash with a server private key and some value of the UI element.
Decoy the cheater with data they think you're using to detect cheats, such as calls to a DetectCheat method with dummy calls to a fake backend. It's the old magician's trick. Wave your hand over here, while you slip a card into the deck with the other hand. Let them waste days on end in a maze that has no exit, with lots of hair pulling.
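The point about uniform versus bell-shaped distributions can be turned into a rough server-side check. The sketch below takes the premise above at face value (human timings cluster around a mean) and uses excess kurtosis as the flatness measure: it is about 0 for a normal curve and about -1.2 for a uniform one. The threshold, the minimum sample size and the function names are assumptions, not tuned values.

```python
import statistics


def excess_kurtosis(samples: list) -> float:
    """Population excess kurtosis; raises ValueError if all samples are identical."""
    mean = statistics.fmean(samples)
    variance = statistics.fmean((x - mean) ** 2 for x in samples)
    if variance == 0:
        raise ValueError("all samples are identical")
    fourth = statistics.fmean((x - mean) ** 4 for x in samples)
    return fourth / variance**2 - 3.0


def looks_uniform(samples: list, threshold: float = -0.8, min_samples: int = 500) -> bool:
    """Flag aggregate timing data that is too flat to look like human input."""
    if len(samples) < min_samples:
        return False  # not enough data to judge
    return excess_kurtosis(samples) < threshold
```

This runs over aggregate data (for example, delays between a target appearing and the player firing), so a single lucky play never triggers it, and a flag is better treated as a reason to look closer than as proof.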
I'd use a combination of minification and AJAX. If all of the functions and data aren't loaded into the page, it'd be more difficult to cheat.
On the other hand, modding turned out to be a very profitable tool for companies like Id Software. Perhaps allowing the system to be modded might make the game that much more enjoyable to the community at large.
Obfuscate your client exposed code as much as possible. Additionally, use some magic.
You can edit the JavaScript in the browser and make it work.
Some people suggest making a call to check with the server, so that the action is validated on the server. Once validated, control comes back to the client side, which performs the action. But I think even this is not foolproof.
For example, take a basic login action in Angular: the client makes a call to the server, the backend validates the username and password, and if they are valid, control comes back to the client, which lets the user log in using Angular.
When I say login using Angular, I mean it stores things like user objects in cookies. But the user can still remove the JS code that makes the call to the backend, return TRUE wherever needed, insert a dummy user object into the cookies (and whatever other objects are needed), and log in. It is very difficult to do, but it is doable. In many scenarios this is not acceptable, even if it takes hours to edit/hack the code.
This is possible in single-page applications, where JS files don't get reloaded for each page. To reduce the chance of getting hacked we can use minified code. And I guess if actions like this are done in the backend (like login in Django), it is much safer.
Please correct me if I am wrong.
I came across a site that does something very similar to Google Suggest. When you type in 2 characters in the search box (e.g. "ca" if you are searching for "canon" products), it makes 4 Ajax requests. Each request seems to get done in less than 125ms. I've casually observed Google Suggest taking 500ms or longer.
In either case, both sites are fast. What are the general concepts/strategies that should be followed in order to get super-fast requests/responses? Thanks.
EDIT 1: by the way, I plan to implement an autocomplete feature for an e-commerce site search where it 1.) provides search suggestions based on what is being typed and 2.) shows a list of potential product matches based on what has been typed so far. I'm trying for something similar to SLI Systems search (see http://www.bedbathstore.com/ for example).
This is a bit of a "how long is a piece of string" question and so I'm making this a community wiki answer — everyone feel free to jump in on it.
I'd say it's a matter of ensuring that:
The server / server farm / cloud you're querying is sized correctly according to the load you're throwing at it and/or can resize itself according to that load
The server / server farm / cloud is attached to a good quick network backbone
The data structures you're querying server-side (database tables or what-have-you) are tuned to respond to those precise requests as quickly as possible
You're not making unnecessary requests (HTTP requests can be expensive to set up; you want to avoid firing off four of them when one will do); you probably also want to throw in a bit of hysteresis management (delaying the request while people are typing, only sending it a couple of seconds after they stop, and resetting that timeout if they start again)
You're sending as little information across the wire as can reasonably be used to do the job
Your servers are configured to re-use connections (HTTP 1.1) rather than re-establishing them (this will be the default in most cases)
You're using the right kind of server; if a server has a large number of keep-alive requests, it needs to be designed to handle that gracefully (NodeJS is designed for this, as an example; Apache isn't, particularly, although it is of course an extremely capable server)
You can cache results for common queries so as to avoid going to the underlying data store unnecessarily
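Two of the points above, delaying the request until typing pauses and caching common queries, can be sketched on the client as follows. This is an illustration under assumptions: `debounce`, `Timers` and `SuggestionCache` are hypothetical names, the timer functions are injectable only so the behaviour can be checked without waiting, and the cache simply evicts its oldest entry when full:

```typescript
// Timer functions are injectable; by default the real ones are used.
interface Timers {
  set(callback: () => void, delayMs: number): unknown;
  clear(handle: unknown): void;
}

const realTimers: Timers = {
  set: (callback, delayMs) => setTimeout(callback, delayMs),
  clear: (handle) => clearTimeout(handle as ReturnType<typeof setTimeout>),
};

// Returns a wrapper that calls fn only after delayMs has passed with no
// further calls. Each new call cancels the pending one and restarts the wait.
function debounce<T>(
  fn: (arg: T) => void,
  delayMs: number,
  timers: Timers = realTimers
): (arg: T) => void {
  let handle: unknown;
  let pending = false;
  return (arg: T) => {
    if (pending) timers.clear(handle);
    pending = true;
    handle = timers.set(() => {
      pending = false;
      fn(arg);
    }, delayMs);
  };
}

// Small in-memory cache of suggestions keyed by the typed prefix.
class SuggestionCache {
  private entries = new Map<string, string[]>();
  private maxEntries: number;

  constructor(maxEntries: number = 1000) {
    this.maxEntries = maxEntries;
  }

  get(prefix: string): string[] | undefined {
    return this.entries.get(prefix);
  }

  set(prefix: string, suggestions: string[]): void {
    if (!this.entries.has(prefix) && this.entries.size >= this.maxEntries) {
      // Map keeps insertion order, so the first key is the oldest entry.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(prefix, suggestions);
  }
}
```

In use, the input handler would call the debounced function on every keystroke, and the function itself would check the cache before sending a request.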
You will need a web server that is able to respond quickly, but that is usually not the problem. You will also need a fast database server that can very quickly find which popular searches start with 'ca'. Google doesn't use a conventional database for this at all; it uses large clusters of servers and a Cassandra-like database, and most of that data is kept in memory as well for quicker access.
I'm not sure if you will need this, because you can probably get pretty good results using only a single server running PHP and MySQL, but you'll have to make some good choices about the way you store and retrieve the information. You won't get these fast results if you run a query like this:
select
    q.search
from
    previousqueries q
where
    q.search LIKE 'ca%'
group by
    q.search
order by
    count(*) DESC
limit 1
This will probably work as long as fewer than 20 people have used your search, but it will likely fail on you well before you reach 100,000.
This link explains how they made instant previews fast. The whole site highscalability.com is very informative.
Furthermore, you should store everything in memory and avoid retrieving data from disk (slow!). Redis, for example, is lightning fast!
You could start by building a fast search engine for your products. Check out Lucene for full-text searching. It is available for PHP, Java and .NET, among others.
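To illustrate the in-memory idea, here is a sketch that ranks past queries once, ahead of time, so that each keystroke costs a single map lookup instead of a grouped and sorted scan. `buildPrefixIndex` and `suggest` are hypothetical names, queries are assumed to be stored in lowercase, and in a real setup the same structure could just as well live in a store such as Redis:

```typescript
// Builds a map from every prefix to its most popular past queries.
// queryCounts maps a (lowercase) query to how often it was searched.
function buildPrefixIndex(
  queryCounts: Map<string, number>,
  topN: number = 5
): Map<string, string[]> {
  // Rank all queries once: most searched first, ties broken alphabetically.
  const ranked = Array.from(queryCounts.entries()).sort(
    (a, b) => b[1] - a[1] || (a[0] < b[0] ? -1 : a[0] > b[0] ? 1 : 0)
  );
  const index = new Map<string, string[]>();
  for (const [query] of ranked) {
    for (let len = 1; len <= query.length; len++) {
      const prefix = query.slice(0, len);
      const bucket = index.get(prefix) ?? [];
      // Queries arrive in rank order, so the first topN per prefix are the best.
      if (bucket.length < topN) {
        bucket.push(query);
        index.set(prefix, bucket);
      }
    }
  }
  return index;
}

// Lookup at typing time: one map access, no sorting or counting.
function suggest(index: Map<string, string[]>, typed: string): string[] {
  return index.get(typed.toLowerCase()) ?? [];
}
```

The index would be rebuilt periodically from the stored searches; the trade-off is extra memory, since every prefix of every query gets its own entry.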