Today I’m going to write about how a viewer logs in to an OpenSimulator grid. This is a considerably more complicated process than a simple website login. If you ever need to find out why login isn’t working it really helps to know what’s going on under the hood.
For simplicity’s sake, we’ll look at the standalone case, where all the regions and grid services are running under a single OpenSim.exe process.
Step 1: As the first step for logging in, the viewer sends an XMLRPC message to the Login Service URI containing the name, password, viewer version and other details. The -loginuri parameter on the command line tells the viewer where the login service is (here it’s 192.168.1.2 and the port number is 9000). In viewers such as Imprudence, the viewer’s grid manager can fetch this information from the grid info XML that the grid advertises at a well-known URL.
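To make Step 1 a little more concrete, here is a rough Python sketch that builds (but does not send) a login request of this kind. Treat the details as assumptions rather than a specification: the method name login_to_simulator, the field names and the "$1$" plus MD5 password encoding are my recollection of the viewer login protocol, and the channel and version values are made up for illustration.

```python
import hashlib
import xmlrpc.client


def build_login_request(first, last, password, start="last"):
    """Return the XML body of a login request. Nothing is sent over the network."""
    params = {
        "first": first,
        "last": last,
        # Assumption: the password travels as "$1$" followed by its MD5 hex digest.
        "passwd": "$1$" + hashlib.md5(password.encode("utf-8")).hexdigest(),
        # Where the user wants to arrive, e.g. "last" or "home".
        "start": start,
        # Hypothetical viewer identification values.
        "channel": "Example Viewer",
        "version": "0.0.1",
    }
    # The login call takes a single struct argument.
    return xmlrpc.client.dumps((params,), methodname="login_to_simulator")


body = build_login_request("Test", "User", "secret")
print(body)
```

Printing the body shows roughly what crosses the wire in an XMLRPC login. To send it for real you would go through xmlrpc.client.ServerProxy pointed at the login URI (http://192.168.1.2:9000/ in this post's example).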
Step 2: The login service uses these details to check that the password is correct. If it is then it looks up the simulator that the user should be placed in (e.g. last or home). In this case there’s only one region called “My Region”. The login service will generate a random ‘circuit code’ and ask the simulator to record this code and set up an ‘agent’ in My Region. The agent represents a user in an OpenSim region.
Step 3: Once the simulator has set up the agent, it confirms this to the login service. The login service will package the circuit code together with the IP address and port of the region and return them to the viewer in a reply XMLRPC message. The login service gets the region’s IP address and port from the ExternalHostName and InternalPort entries in the bin/config-include/Regions.ini file (config files there can also have other names as long as they end with .ini). In this case the entry is
[My Region]
RegionUUID = dd5b77f8-bf88-45ac-aace-35bd76426c81
Location = 1000,1000
...
InternalPort = 9000
ExternalHostName = 192.168.1.2
The host name here is 192.168.1.2 (same as the login service since we’re on a standalone) and the internal port is 9000. This is the same number as the login service port, but in this case it specifies the UDP port that the client should use for messages between itself and the simulator (we’ll ignore HTTP capabilities in this post). The ExternalHostName can also be SYSTEMIP, in which case the default IP of the machine hosting the simulator is used (which would also be 192.168.1.2).
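If you want to double-check what address and port a region file will hand out, a few lines of Python can pull the values out. This is only a sketch: OpenSimulator reads these files with its own .ini parser, so Python's configparser is an approximation, and the region_endpoints helper and the trimmed-down sample text are mine.

```python
import configparser

# A cut-down copy of the region entry above (other settings omitted).
SAMPLE = """
[My Region]
RegionUUID = dd5b77f8-bf88-45ac-aace-35bd76426c81
Location = 1000,1000
InternalPort = 9000
ExternalHostName = 192.168.1.2
"""


def region_endpoints(ini_text):
    """Map each region name to the (host, UDP port) pair the viewer will be given."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    return {
        name: (parser[name]["ExternalHostName"], parser[name].getint("InternalPort"))
        for name in parser.sections()
    }


endpoints = region_endpoints(SAMPLE)
print(endpoints)
```

Note that a value of SYSTEMIP would come back as the literal string here. Resolving it to the machine's default IP is something the simulator does, not this sketch.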
Step 4: When the viewer receives the XMLRPC reply, it extracts the circuit code, simulator IP and port. To make first contact with the simulator, it sends it a UseCircuitCode UDP message containing the circuit code. The simulator compares this against the circuit code that the Login Service gave it for that client. If they match then the simulator sends back an Ack packet and the two start talking to each other (i.e. the viewer gets sent terrain, object and texture information and the user can move around the region). If they don’t match then the simulator logs a warning and ignores the UseCircuitCode message.
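The circuit code exchange in Steps 2 to 4 can be modelled in a few lines. This is a toy with no networking, purely to show who knows what and when. The class, function and field names are hypothetical and are not OpenSimulator's own.

```python
import secrets


class ToySimulator:
    """Remembers which circuit code the login service announced for each agent."""

    def __init__(self):
        self.expected = {}  # agent id -> circuit code

    def register_agent(self, agent_id, circuit_code):
        # Step 2: the login service asks the simulator to record the code.
        self.expected[agent_id] = circuit_code

    def use_circuit_code(self, agent_id, circuit_code):
        # Step 4: the viewer's first UDP message must carry the recorded code.
        if agent_id in self.expected and self.expected[agent_id] == circuit_code:
            return "ack"
        return None  # a real simulator would log a warning and ignore the message


def toy_login(simulator, agent_id):
    """Steps 2 and 3: generate a code, register it, and build the reply for the viewer."""
    code = secrets.randbelow(2**31)
    simulator.register_agent(agent_id, code)
    return {"circuit_code": code, "sim_ip": "192.168.1.2", "sim_port": 9000}


sim = ToySimulator()
reply = toy_login(sim, "agent-1")
print(sim.use_circuit_code("agent-1", reply["circuit_code"]))
```

The point of the design is that the simulator never sees the password. It only has to trust that whoever presents the right circuit code was vetted by the login service.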
Whew, quite a process, eh? As you can imagine, there’s a lot that can go wrong. Let’s go through the possible problems.
Viewer has wrong loginuri
This is an easy one. If the viewer is trying to log in with the wrong URI (e.g. 192.168.1.3 in the example above) or the wrong port (e.g. 9001) then you’ll get something like an “Unable to connect to grid” message. Nothing will ever reach the Login Service.
Viewer has wrong login credentials
Another easy one. The Login Service will reject the credentials and tell the viewer, which will display a “Could not authenticate your avatar” message.
A firewall prevents the Login Service from replying to the viewer
In this case the viewer can send some initial TCP packets to the Login Service but can’t get anything back. As above, the viewer will present an “Unable to connect to grid” message but this time after a longer pause until it times out on the Login Service connection.
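For the first part of the login, a quick TCP connection test from the viewer machine can help tell these cases apart. The sketch below is an assumption-laden helper of my own (probe_tcp is not part of any OpenSimulator tooling). A quick refusal usually means nothing is listening at that address and port, whereas a timeout suggests packets are being dropped somewhere along the way.

```python
import socket


def probe_tcp(host, port, timeout=5.0):
    """Try to open a TCP connection and describe what happened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"
    except socket.timeout:
        # No answer at all within the timeout, e.g. a firewall dropping packets.
        return "timed out"
    except ConnectionRefusedError:
        # The machine answered, but nothing is listening on that port.
        return "refused"
    except OSError as exc:
        return f"error: {exc}"


# Example (assumes the login service from this post):
# print(probe_tcp("192.168.1.2", 9000))
```

A "connected" result only shows that the TCP handshake works in both directions. It says nothing about the later UDP traffic to the simulator.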
Viewer receives misconfigured external host name from Regions.ini
Now it gets more complicated. Suppose that instead of putting 192.168.1.2 in the My Region config I accidentally put 192.168.1.3 instead, which is a machine that doesn’t exist on my network.
[My Region]
RegionUUID = dd5b77f8-bf88-45ac-aace-35bd76426c81
Location = 1000,1000
...
InternalPort = 9000
ExternalHostName = 192.168.1.3
In this case, the first part of the login process works okay and the progress bar moves along in the viewer. But when the Login Service returns the simulator information to the viewer, it returns the ExternalHostName of 192.168.1.3 instead of 192.168.1.2. The viewer will make a number of attempts to contact this non-existent simulator for the second part of the login, and so appear to hang for a while on a message such as “Waiting for region handshake…” before failing with a “We’re having trouble connecting…”
In this case, since there is no machine at 192.168.1.3, a simple ping will reveal the mistake. If there is a machine at that address or it’s the port number that is wrong then things are more complicated. It’s difficult to diagnose problems here since UDP messages are connectionless, unlike TCP. If you have a utility like netcat available on the viewer machine, you can try sending nonsense to the address and port given in Regions.ini. For instance, above we could do
echo "foo" | nc -u 192.168.1.2 9000
and the simulator would print out a “Malformed data” message.
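If netcat isn't installed on the viewer machine, the same probe is easy to do from Python. This is a minimal sketch and send_udp_probe is my own helper name. Like the netcat command, it just fires one datagram at the simulator's UDP port and expects nothing back.

```python
import socket


def send_udp_probe(host, port, payload=b"foo\n"):
    """Send one junk datagram and return the number of bytes handed to the network."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(payload, (host, port))


# Example (assumes the region entry from this post):
# send_udp_probe("192.168.1.2", 9000)
```

Because UDP is connectionless, a successful send proves nothing on its own. The evidence is whether the complaint about bad data shows up on the simulator console.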
Viewer can’t reach region external host name
Now let’s suppose that the ExternalHostName and InternalPort are correct, but the viewer can’t reach that address for some reason (e.g. UDP messages to that port are blocked by a firewall). You’ll see exactly the same symptoms as if the host name is misconfigured. The diagnostics are also the same, with the addition that you need to thoroughly check your firewall and other network settings.
You can also see this if you’ve specified a public IP for ExternalHostName but you’re attempting a connection from within your LAN and your router does not support NAT loopback. The easiest solution is to get a router that does support NAT loopback though you might also want to try the workarounds listed on that wiki page.
A firewall prevents the simulator from replying to the viewer
Unlike the firewall blocking the login service reply above, this time the first part of the login process will complete correctly and the simulator will even receive the UseCircuitCode message. However, the Ack that it replies with (and any other UDP messages) is blocked by a firewall. In the simulator log you will see messages such as
[LLUDPSERVER]: Ignoring a repeated UseCircuitCode from 2c3b8307-e257-4d1e-b12f-76f2b8f50ee9 at 192.168.1.3:1208 for circuit 546230463
as the viewer resends the UseCircuitCode packet another 3 times (while it displays the “Waiting for region handshake…” message). Eventually, the viewer gives up and displays the “We’re having trouble connecting…” message. In this case, you need to carefully check that your firewall allows outbound UDP messages from the simulator to the viewer’s IP address.
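If you have a long simulator log to wade through, a small script can count these repeats per viewer address. The sketch below assumes the log lines look like the one quoted above. The regular expression and the repeated_use_circuit_code helper are mine, so adjust them if your log format differs.

```python
import re
from collections import Counter

# Matches the "repeated UseCircuitCode" line format quoted in this post.
PATTERN = re.compile(
    r"\[LLUDPSERVER\]: Ignoring a repeated UseCircuitCode from (?P<agent>\S+) "
    r"at (?P<ip>[\d.]+):(?P<port>\d+) for circuit (?P<circuit>\d+)"
)


def repeated_use_circuit_code(log_lines):
    """Count repeated UseCircuitCode warnings per (viewer IP, circuit code)."""
    counts = Counter()
    for line in log_lines:
        match = PATTERN.search(line)
        if match:
            counts[(match["ip"], int(match["circuit"]))] += 1
    return counts


sample_line = (
    "[LLUDPSERVER]: Ignoring a repeated UseCircuitCode from "
    "2c3b8307-e257-4d1e-b12f-76f2b8f50ee9 at 192.168.1.3:1208 for circuit 546230463"
)
print(repeated_use_circuit_code([sample_line] * 3))
```

Several repeats for the same circuit within a few seconds is a good hint that the simulator's replies are not reaching that viewer.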
As you can see, the login process is complicated. Much of this complexity exists so that in grid mode simulators can be hosted on different machines to the login service.
In grid mode, all of the above information still applies, with the addition that the login service and simulators communicate over a network rather than within a single process. This is another point of failure. If there’s a problem here then you should see an error in the login service log and the viewer will return with an “Unable to connect to grid” message.
Hi folks. At the weekly OpenSim/OSGrid development meeting last Tuesday, Nalates Uriah relayed that Linden Lab are planning some fundamental changes to how mesh is handled in Second Life. Instead of uploading a mesh as an asset and applying it to an existing prim, it sounds like a mesh will be treated as a scene object in its own right. Aspects of the mesh data format will also change.
When Linden Lab releases new viewers with these changes they won’t be able to see any of the meshes previously uploaded to the Linden Lab beta grid. The same will be true for OpenSimulator.
OpenSimulator will need changes to work with the new mesh objects and data. As it is, OpenSimulator treats uploaded mesh data as an opaque blob which it simply stores as an asset. The asset id is placed in the SculptTexture property of the PrimitiveBaseShape of a SceneObject or in the assetID slot of an InventoryItemBase. When a client fetches such a scene object or inventory item, it separately requests the asset data through the GetMesh capability – OpenSim doesn’t parse the asset data at all.
This simplicity means that old mesh data might continue to work with old mesh viewers (this is not a guarantee). But of course, if you use the old viewers then you won’t be able to see the newer mesh objects. It’s possible that a third party viewer could implement both the old mesh approach and the new approach. However, I think this is very unlikely, particularly if the new changes are due to deficiencies in the existing approach.
We won’t know what changes are required in OpenSimulator until the new mesh approach is public. In the best case, the asset data can be left opaque for now and some new properties added to scene objects. In the worst case, the asset data itself will require extensive parsing but I suspect that this will not be necessary.
So in short, I strongly recommend that you don’t rely on any mesh data that you upload to OpenSimulator until the new mesh changes are implemented. In fact, I would recommend waiting until mesh is in public use on the Linden Lab grid, in case further changes need to occur down the road.
Hi folks. Just a brief blog post to remind people to take care when experimenting with Linden Lab’s Beta Second Life Viewer 2 with the current OpenSim builds. I’ve seen mixed reports of how well it works with OpenSim at the moment (some people seem to suffer a crash almost immediately while others seem to be able to use it to some extent).
However, according to John Hurliman the version of the OpenMetaverse library currently being used by OpenSim has a problem dealing with at least one of the new inventory packets in the 2.0 Viewer. This means that it’s not impossible that using the viewer with OpenSim today could inadvertently destroy some of your inventory items.
This is certainly not deliberate in any way – it would be a result of the mismatch between the viewer’s expectations of the server-viewer communications protocol and OpenSim’s (via libOpenMetaverse).
John says there are also issues with the viewer no longer shipping with default terrain textures, making OpenSim regions using those textures appear all white (if only it were still Christmas). So it sounds like experimentation is best done with an alt avatar for now.
The viewer itself looks very interesting and I look forward to trying it out when I next get an opportunity. I’m not sure how long it will take to get support in OpenSim but I wouldn’t be at all surprised if people are already working on it. John himself says he’s looking to get various issues resolved though it’s not at the top of his priority stack at the moment.