Today I’m going to write about how a viewer logs in to an OpenSimulator grid. This is a considerably more complicated process than a simple website login. If you ever need to find out why login isn’t working it really helps to know what’s going on under the hood.
For simplicity’s sake, we’ll look at the standalone case, where all the regions and grid services are running under a single OpenSim.exe process.
Step 1: As the first step for logging in, the viewer sends an XMLRPC message to the Login Service URI containing the name, password, viewer version and other details. The -loginuri parameter on the command line tells the viewer where the login service is (here it’s 192.168.1.2 and the port number is 9000). In viewers such as Imprudence, the viewer’s grid manager can fetch this information from the grid info XML that the grid advertises at a well-known URL.
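To make Step 1 a little more concrete, here is a minimal Python sketch of what such an XMLRPC login request body might look like. The method name login_to_simulator, the field names and the "$1$" + MD5 password form follow the Second Life style login protocol as I understand it; none of them are spelled out in this post, so treat them as assumptions. The channel and version values are made up for illustration.

```python
import hashlib
import xmlrpc.client


def build_login_request(first, last, password, start="last"):
    """Build the XML-RPC body a viewer might POST to the login URI.

    Assumption: field names and the "$1$" + MD5 password form follow the
    Second Life style login protocol; they are not taken from this post.
    """
    params = {
        "first": first,
        "last": last,
        "passwd": "$1$" + hashlib.md5(password.encode("utf-8")).hexdigest(),
        "start": start,               # "last" or "home"
        "channel": "Example Viewer",  # hypothetical viewer name
        "version": "0.0.0",           # hypothetical viewer version
    }
    # dumps() only serialises the call; nothing is sent over the network.
    return xmlrpc.client.dumps((params,), methodname="login_to_simulator")


if __name__ == "__main__":
    print(build_login_request("Test", "User", "secret"))
```

Printing the body like this is handy when comparing what a viewer sends against what the login service logs.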
Step 2: The login service uses these details to check that the password is correct. If it is, then it looks up the simulator that the user should be placed in (e.g. their last or home location). In this case there’s only one region called “My Region”. The login service will generate a random ‘circuit code’ and ask the simulator to record this code and set up an ‘agent’ in My Region. The agent represents a user in an OpenSim region.
Step 3: Once the simulator has set up the agent, the login service packages the circuit code together with the IP address and port of the region and returns it to the viewer in a reply XMLRPC message. The login service gets the region’s IP address and port from the ExternalHostName and InternalPort entries in the bin/Regions/Regions.ini file (config files there can also have other names as long as they end with .ini). In this case the entry is
[My Region]
RegionUUID = dd5b77f8-bf88-45ac-aace-35bd76426c81
Location = 1000,1000
...
InternalPort = 9000
ExternalHostName = 192.168.1.2
The host name here is 192.168.1.2 (same as the login service since we’re on a standalone) and the internal port is 9000. Despite its name, InternalPort here specifies the port that the client should use for UDP messages between itself and the simulator (we’ll ignore HTTP capabilities in this post). The ExternalHostName can also be SYSTEMIP, in which case the default IP of the machine hosting the simulator is used (which would also be 192.168.1.2).
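When checking what address the login service will hand out, it can help to read the region file the same way a program would. The following is a rough Python sketch using the standard configparser module; the function name region_endpoints is hypothetical, the elided "..." entries are left out, and a SYSTEMIP value is returned as-is rather than resolved.

```python
import configparser

# Sample region entry from this post (the elided "..." entries are omitted).
REGIONS_INI = """
[My Region]
RegionUUID = dd5b77f8-bf88-45ac-aace-35bd76426c81
Location = 1000,1000
InternalPort = 9000
ExternalHostName = 192.168.1.2
"""


def region_endpoints(ini_text):
    """Return {region name: (host, udp_port)} for every region section.

    Hypothetical helper: it only reads the two keys discussed above and
    does not resolve SYSTEMIP to a real address.
    """
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    endpoints = {}
    for name in parser.sections():
        host = parser[name]["ExternalHostName"]
        port = parser[name].getint("InternalPort")
        endpoints[name] = (host, port)
    return endpoints


if __name__ == "__main__":
    for region, (host, port) in region_endpoints(REGIONS_INI).items():
        print(region, host, port)
```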
Step 4: When the viewer receives the XMLRPC reply, it extracts the circuit code, simulator IP and port. To make first contact with the simulator, it sends a UseCircuitCode UDP message containing the circuit code. The simulator compares this against the circuit code that the Login Service gave it for that client. If they match then the simulator sends back an Ack packet and the two start talking to each other (i.e. the viewer gets sent terrain, object and texture information and the user can move around the region). If they don’t match then the simulator logs a warning and ignores the UseCircuitCode message.
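The circuit code handshake in Steps 2 to 4 can be modelled in a few lines. This is a toy Python sketch, not OpenSim code: the Simulator class, its method names and the login function are all hypothetical, and real packets, agents and warnings are reduced to simple return values.

```python
import secrets


class Simulator:
    """Toy model of the simulator side of the circuit code check."""

    def __init__(self):
        self.expected = {}  # agent id -> circuit code recorded at login

    def add_agent(self, agent_id, circuit_code):
        # Step 2: the login service asks the simulator to record the code.
        self.expected[agent_id] = circuit_code

    def handle_use_circuit_code(self, agent_id, circuit_code):
        # Step 4: compare the viewer's code with the recorded one.
        if agent_id not in self.expected:
            return "ignored"
        if self.expected[agent_id] != circuit_code:
            return "ignored"
        return "ack"


def login(simulator, agent_id):
    """Toy login service: generate a random code and register the agent."""
    circuit_code = secrets.randbits(32)
    simulator.add_agent(agent_id, circuit_code)
    return circuit_code  # Step 3: returned to the viewer in the reply


if __name__ == "__main__":
    sim = Simulator()
    code = login(sim, "agent-1")
    print(sim.handle_use_circuit_code("agent-1", code))
```

The point of the model is that the viewer never invents the code itself: it can only succeed by echoing back what the login service registered with the simulator.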
Whew, quite a process, eh? As you can imagine, there’s a lot that can go wrong. Let’s go through the possible problems.
Viewer has wrong loginuri
This is an easy one. If the viewer is trying to log in with the wrong uri (e.g. 192.168.1.3 in the example above) or wrong port (e.g. 9001) then you’ll get something like an “Unable to connect to grid” message. Nothing will ever reach the Login Service.
Viewer has wrong login credentials
Another easy one. The Login Service will reject the credentials and tell the viewer, which will display a “Could not authenticate your avatar” message.
A firewall prevents the Login Service from replying to the viewer
In this case the viewer can send some initial TCP packets to the Login Service but can’t get anything back. As above, the viewer will present an “Unable to connect to grid” message but this time after a longer pause until it times out on the Login Service connection.
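Since both of these failures end in the same “Unable to connect to grid” message, it can be useful to probe the login port directly from the viewer machine. Below is a rough Python sketch; the function name probe_login_port and its result strings are my own invention. As a rule of thumb (an assumption, not something the viewer reports), a quick refusal suggests nothing is listening on that port, whereas a timeout suggests a non-existent host or a firewall dropping packets.

```python
import socket


def probe_login_port(host, port, timeout=5.0):
    """Try a plain TCP connection to the login service address.

    Hypothetical helper: returns "reachable", "refused", "timeout" or an
    "error: ..." string. It does not speak XMLRPC at all.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "reachable"
    except ConnectionRefusedError:
        return "refused"
    except socket.timeout:
        return "timeout"
    except OSError as exc:
        return "error: %s" % exc


if __name__ == "__main__":
    # Address and port taken from the example in this post.
    print(probe_login_port("192.168.1.2", 9000))
```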
Viewer receives misconfigured external host name from Regions.ini
Now it gets more complicated. Suppose that instead of putting 192.168.1.2 in the My Region config I accidentally put 192.168.1.3, which is a machine that doesn’t exist on my network.
[My Region]
RegionUUID = dd5b77f8-bf88-45ac-aace-35bd76426c81
Location = 1000,1000
...
InternalPort = 9000
ExternalHostName = 192.168.1.3
In this case, the first part of the login process works okay and the progress bar moves along in the viewer. But when the Login Service returns the simulator information to the viewer, it returns the ExternalHostName of 192.168.1.3 instead of 192.168.1.2. The viewer will make a number of attempts to contact this non-existent simulator for the second part of the login, and so appear to hang for a while on a message such as “Waiting for region handshake…” before failing with a “We’re having trouble connecting…” message.
In this case, since there is no machine at 192.168.1.3, a simple ping will reveal the mistake. If there is a machine at that address, or it’s the port number that is wrong, then things are more complicated. It’s difficult to diagnose problems here since UDP, unlike TCP, is connectionless. If you have a utility like netcat available on the viewer machine, you can try sending nonsense to the address and port given in Regions.ini. For instance, above we could do
echo "foo" | nc -u 192.168.1.2 9000
and the simulator would print out a “Malformed data” message.
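If netcat isn’t to hand, the same kind of nonsense datagram can be sent with a few lines of Python. This is just a sketch with a hypothetical function name; it only sends, and because UDP is connectionless a successful send says nothing about whether the packet arrived, so you still need to watch the simulator console for the reaction.

```python
import socket


def send_udp_probe(host, port, payload=b"foo\n"):
    """Send one junk UDP datagram and return the number of bytes sent.

    Hypothetical helper. A return value only means the datagram left this
    machine, not that the simulator received it.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(payload, (host, port))


if __name__ == "__main__":
    # Address and port taken from the Regions.ini example in this post.
    send_udp_probe("192.168.1.2", 9000)
```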
Viewer can’t reach region external host name
Now let’s suppose that the ExternalHostName and InternalPort are correct, but the viewer can’t reach that address for some reason (e.g. UDP messages to that port are blocked by a firewall). You’ll see exactly the same symptoms as if the host name is misconfigured. The diagnostics are also the same, with the addition that you need to thoroughly check your firewall and other network settings.
You can also see this if you’ve specified a public IP for ExternalHostName but you’re attempting a connection from within your LAN and your router does not support NAT loopback. The easiest solution is to get a router that does support NAT loopback though you might also want to try the workarounds listed on that wiki page.
A firewall prevents the simulator from replying to the viewer
Unlike the firewall blocking the login service reply above, this time the first part of the login process will complete correctly and the simulator will even receive the UseCircuitCode message. However, the Ack that it replies with (and any other UDP messages) is blocked by a firewall. In the simulator log you will see messages such as
[LLUDPSERVER]: Ignoring a repeated UseCircuitCode from 2c3b8307-e257-4d1e-b12f-76f2b8f50ee9 at 192.168.1.3:1208 for circuit 546230463
as the viewer resends the UseCircuitCode packet another 3 times (while it displays the “Waiting for region handshake…” message). Eventually, the viewer gives up and displays the “We’re having trouble connecting…” message. In this case, you need to carefully check that your firewall allows outbound UDP messages from the simulator to the viewer’s IP address.
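On a busy simulator these lines are easy to miss, so here is a small Python sketch that counts them per viewer address and circuit. The regular expression is written against the single sample line shown above; real log lines may carry timestamps or differ between OpenSim versions, which is why it uses search rather than a full-line match. The names REPEAT_RE and count_repeats are hypothetical.

```python
import re
from collections import Counter

# Pattern derived from the one sample log line in this post.
REPEAT_RE = re.compile(
    r"\[LLUDPSERVER\]: Ignoring a repeated UseCircuitCode from "
    r"(?P<agent>[0-9a-f-]{36}) at (?P<ip>[\d.]+):(?P<port>\d+) "
    r"for circuit (?P<circuit>\d+)"
)


def count_repeats(log_lines):
    """Count repeated UseCircuitCode warnings per (viewer IP, circuit)."""
    counts = Counter()
    for line in log_lines:
        match = REPEAT_RE.search(line)
        if match:
            counts[(match["ip"], int(match["circuit"]))] += 1
    return counts


if __name__ == "__main__":
    sample = (
        "[LLUDPSERVER]: Ignoring a repeated UseCircuitCode from "
        "2c3b8307-e257-4d1e-b12f-76f2b8f50ee9 at 192.168.1.3:1208 "
        "for circuit 546230463"
    )
    print(count_repeats([sample, sample]))
```

Several repeats for the same circuit, with no successful login afterwards, point towards replies from the simulator not getting back to the viewer.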
As you can see, the login process is complicated. Much of this complexity exists so that in grid mode simulators can be hosted on different machines to the login service.
In grid mode, all of the above information still applies, with the addition that the login service and simulators communicate over a network rather than within a single process. This is another point of failure. If there’s a problem here then you should see an error in the login service log and the viewer will return with an “Unable to connect to grid” message.
Hi folks. For the past couple of years, as well as working for IBM and then on various OpenSim-related jobs, I’ve been doing a part-time Masters in Software Engineering at the University of Oxford. As part of this degree, as well as completing assignments for various taught courses, the student has to conduct an independent project and write a dissertation about it.
Naturally, I chose virtual environments/worlds as my project area, with OpenSim as the chief exemplar :-). More specifically, I decided to do some thinking about the possible architectures for creating a truly decentralized Internet-Scale virtual environment, as opposed to the classic ‘grid’ OpenSim/Second Life model where simulators are distributed but services (assets, inventory, users) are centralized. Naturally, initiatives such as Crista Lopes’ Hypergrid play a large part in these considerations.
The initial part is devoted to describing the classic ‘grid’ model, both conceptually (e.g. through the lens of the dimensions of transparency for distributed systems defined by the ISO International Standard on Open Distributed Processing) and in formal Z notation. The second part of the dissertation takes this description and compares it against what I think are the requirements for a truly Internet-scale virtual environment network. The last part of the text explores alternative architectures to the classic grid model for getting to Internet-scale.
Some of the models are significantly simplified from their real-life implementations. For instance, the OpenSim model presented consists of only four services (asset, inventory, user and grid). The Hypergrid model is based on Hypergrid 1.0 rather than the more recent revisions. In the main, these simplifications were made for clarity of argument whilst hopefully leaving the critical architectural features intact.
The dissertation also doesn’t take into account approaches that are radically different from OpenSim’s client-server system, such as OpenCroquet’s peer-to-peer synchronization architecture. I’d have loved to have written and thought more about this but I simply ran out of space and time.
Nonetheless, I hope this dissertation might prove useful to people thinking about the future of Internet-Scale virtual environments, if only as a springboard for their own explorations. Naturally, I’d be very interested in any comments or questions that people might have.
p.s. For anybody wondering, this final bit of work was enough to secure me a pass with distinction ;-).
What is the Hypergrid?
December 19, 2008. Posted by justincc in opensim, opensim-arch, opensim-dev, opensim-grid, opensim-news.
The Hypergrid is a neural-interactive simulation, a computer generated dreamworld built to keep you under control. It is the world that has been pulled over your eyes, to blind you from the truth.
The Hypergrid has you…
Or to put it another way :), the Hypergrid is a new core OpenSim network architecture which joins the existing standalone and grid architectures. I’ve mentioned it before in various This Week In OpenSim Dev posts and there is a good OpenSim wiki page on it as well as a blog post by Rock. However, I thought that I would give my own summary of it. This is an extension of my original technical analysis of Diva’s initial Hypergrid patch as posted to the Opensim-dev mailing list.
No really, what is the Hypergrid?
Essentially, a Hypergrid is a confederation of OpenSim systems that have enabled the Hypergrid facility. Each user has a home grid or standalone where their user profile, avatar appearance and inventory are stored. Possible homes range from a standalone on the user’s own machine up to a large OpenSim grid hosted by a third party. Users can travel from their home to a different grid or standalone via a hyperlink (set up via some funky map and region handle manipulation). When they arrive at the foreign grid, they carry with them the url of their home asset and inventory services. This means that in the Hypergrid:
- If a user rezzes an object from their inventory, the assets for that object are fetched from the home asset service and permanently inserted into the foreign asset service. So when that user goes away or logs off, the assets are still available to be seen by everybody else.
- If a user copies/takes an object from a foreign grid, then the relevant inventory and asset data gets sent to their home inventory and asset services.
In other words, the necessary information to rez inventory and avatar appearance is transparently passed between linked grids as necessary. This allows a user to hop around different Hypergrid enabled grids and standalones as if they were travelling around a single system.
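The two transfer directions described above can be modelled with a pair of in-memory services. This is a deliberately simplified Python sketch: the Grid class and both function names are hypothetical, real Hypergrid services talk over HTTP rather than sharing dictionaries, and there is no permission checking at all.

```python
class Grid:
    """Toy grid with an asset service and an inventory service."""

    def __init__(self, name):
        self.name = name
        self.assets = {}     # asset id -> asset data
        self.inventory = {}  # user -> {item name: [asset ids]}


def rez_from_inventory(user, item, home, foreign):
    """Rezzing on a foreign grid copies the item's assets from home."""
    asset_ids = home.inventory[user][item]
    for asset_id in asset_ids:
        # The copy stays on the foreign grid after the user leaves.
        foreign.assets.setdefault(asset_id, home.assets[asset_id])
    return asset_ids


def take_object(user, item, asset_ids, home, foreign):
    """Taking a foreign object sends assets and inventory data home."""
    for asset_id in asset_ids:
        home.assets.setdefault(asset_id, foreign.assets[asset_id])
    home.inventory.setdefault(user, {})[item] = list(asset_ids)


if __name__ == "__main__":
    home, foreign = Grid("home"), Grid("foreign")
    home.assets["tex-1"] = b"texture bytes"
    home.inventory["alice"] = {"chair": ["tex-1"]}
    rez_from_inventory("alice", "chair", home, foreign)
    print(sorted(foreign.assets))
```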
Thus, on a conceptual level, some of the pros of the Hypergrid are:
- It effectively distributes asset and inventory load over multiple services on multiple grids. This is a really good alternative to scaling up a central service to internet scale.
- It allows grids to seamlessly link to others yet retain control over their own services.
And the cons:
- In the Hypergrid, assets, including scripts, are liberally spread around grids. If someone travels to your Hypergrid enabled OpenSim and acquires an object, then they get a copy of all those assets in their home services (which may be situated on their own machine).
- Regions must be manually linked and appear in a grid’s map. One can’t just enter an address in a url bar to go to another grid; the grid owner must set up the link. As far as I can see this is a limitation of what we have to work with in the Linden Lab Second Life client. There may be some arguments for restriction (you don’t want someone coming to your PG grid from an adult grid and depositing god knows what).
- Home services need to be exposed to foreign linked grids. So malicious grids could possibly fetch and put things they shouldn’t. There are already some proposals for dealing with this. However, this may also be another good argument for controlling who can link to you, for now.
Some current issues
I think there are some issues with the current Hypergrid code. These aren’t fundamental architectural concerns.
- Assets associated with worn attachments and appearance are not uploaded to a foreign grid from the home grid on teleport in. For other users without cache copies, such avatars will always appear gray and I don’t think that any attachments would appear. This is fixable.
- Prim inventory inspection does not go deep enough. As far as I can see, in the ‘asset mapper’ you look for contained textures when rezzing an item, but not any other contained assets (including contained objects, clothing, notecards, scripts, etc.). This will result in an incomplete rez. However, this is fixable. Indeed, I’ve already written all the code required to do this for OpenSim Archive support (OARs).
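To illustrate what a deeper inspection might involve, here is a minimal Python sketch that walks every asset reachable from a rezzed item rather than stopping at its textures. It is not the asset mapper or the OAR code: the function name and the references mapping (asset id to the ids it contains) are hypothetical stand-ins for parsing real asset data.

```python
def gather_asset_ids(root_id, references):
    """Collect every asset id reachable from root_id.

    references is a hypothetical mapping of asset id -> list of asset ids
    it contains (textures, contained objects, notecards, scripts, ...).
    """
    seen = set()
    pending = [root_id]
    while pending:
        asset_id = pending.pop()
        if asset_id in seen:
            continue  # already handled; also guards against cycles
        seen.add(asset_id)
        pending.extend(references.get(asset_id, []))
    return seen


if __name__ == "__main__":
    refs = {
        "box": ["wood-texture", "inner-box", "notecard"],
        "inner-box": ["metal-texture", "script"],
    }
    print(sorted(gather_asset_ids("box", refs)))
```

Looking only one level down would miss inner-box’s texture and script in this example, which is the kind of incomplete rez described above.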
In my opinion, the Hypergrid is a very promising architectural direction for OpenSim. It moves from a system of centralized services to one where a user can seamlessly navigate between many different grids whilst sourcing their appearance and inventory from their own home services. This decentralization is a commonly hoped-for change that I’ve written about previously, as have others. Though there are quite a few problems to resolve yet (such as the short term technical ones that I’ve talked about, as well as more fundamental questions such as that of grid service security), I think that the Hypergrid has a lot of potential.
If you want to set up your own Hypergrid enabled OpenSim, there are instructions on the OpenSim wiki, as well as a list of Hypergrid enabled OpenSim instances. You will need the owner’s permission to establish a link. I believe that Wright Plaza on OSGrid is also Hypergrid enabled, though it isn’t yet listed.
Many thanks must go to Diva (Crista Lopes) who did all the hard work in formulating the architecture and writing the code such that a patch could be accepted into OpenSim. The OpenSim community (both developers and users) also deserves a great deal of thanks, since their hard work and enthusiasm was absolutely instrumental in proving that the concept was viable.
(The image used at the top of this post is licensed under Attribution-Share Alike 2.5 from the OpenSim wiki)