There are many advantages to running Perforce Helix on Linux servers. See the process and pitfalls encountered when converting a distributed Perforce infrastructure from Windows to Linux.
From Windows to Linux: Converting a Distributed Perforce Helix Infrastructure
1. From Windows to Linux: Converting a Distributed Perforce Helix Infrastructure
David Foglesong
Senior Systems Software Engineer
Tableau Software
2. About Tableau Software
Tableau Software (NYSE: DATA) helps people see and
understand data. Tableau helps anyone quickly analyze,
visualize and share information. More than 35,000 customer
accounts get rapid results with Tableau in the office and on-
the-go. And tens of thousands of people use Tableau Public to
share data in their blogs and websites. See how Tableau can
help you by downloading the trial at www.tableau.com/trial.
3. Perforce At Tableau
Started with CVS.
Converted to Perforce in 2007 (on Windows).
Infrastructure has evolved over time: One server > commit
+ replicas > commit + replicas + edge servers.
650K+ changes, 900+ users, commit = 500G DB, edge =
800G DB.
3 main dev offices, several smaller offices.
4. Why Change?
Tableau product change from Win-only to Win/Mac to
Win/Mac/mobile (iOS/Android).
Run other parts of build infrastructure (Artifactory,
ReviewBoard, OpenGrok) on Linux.
Dev management wants “best” systems = performance,
stability, scalability.
Perforce admins prefer Linux.
5. Before
1x commit server
1x RO replica (backup and reporting)
3x fwd replica (at main dev offices)
Proxies at smaller dev offices.
1x edge server (for build farm) + RO replica (backup)
Brokers in front of everything.
All on Windows.
7. After
1x commit server
2x RO replicas (backup and reporting)
5x edge servers (3x for offices, 2x for build farm)
5x RO replicas (backups for edge servers)
Brokers in front of everything.
All on Linux.
Most in central data center.
Proxies in dev offices.
9. Edge Servers vs. Forwarding Replicas
Why switch offices to edge servers?
• Remote users (Palo Alto) see small lag on commands.
• Want to move most user data (db.have) off commit server. Goal is to
have as few connections as possible (ideally none) from human users on
commit server.
Move to cluster?
• Not at this time. Want to preserve option of moving edge server to dev
office if needed.
10. Process: Old vs. New
Old way = checkpoint manipulation
• Either via scripts or p4migrate.
• Need to fix case inconsistencies in metadata. E.g.,
- //depot/source/foo.c#1 vs. //depot/SOURCE/bar.c#1
- User “bob” vs. user “BOB”.
• Transfer and convert archive files.
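A minimal sketch of how the case inconsistencies listed above could be detected before a conversion. It assumes you have already exported depot paths and user names to plain Python lists (for example from "p4 files" and "p4 users" output); the function names are hypothetical and are not part of p4migrate or any Perforce tool.

```python
from collections import defaultdict


def find_case_collisions(names):
    """Group names that are identical except for letter case."""
    groups = defaultdict(set)
    for name in names:
        groups[name.lower()].add(name)
    # Only keys with more than one distinct spelling are a problem.
    return {
        key: sorted(spellings)
        for key, spellings in groups.items()
        if len(spellings) > 1
    }


def find_directory_case_collisions(depot_paths):
    """Check every directory prefix of each depot path for case variants."""
    prefixes = set()
    for path in depot_paths:
        parts = path.lstrip("/").split("/")
        # Skip the last component (the file name); collect directory prefixes.
        for depth in range(1, len(parts)):
            prefixes.add("//" + "/".join(parts[:depth]))
    return find_case_collisions(prefixes)


if __name__ == "__main__":
    paths = ["//depot/source/foo.c", "//depot/SOURCE/bar.c"]
    print(find_directory_case_collisions(paths))
    print(find_case_collisions(["bob", "BOB", "alice"]))
```

On a case-insensitive Windows server these spellings refer to the same directory or user; on a case-sensitive Linux server they become distinct, which is why they have to be reconciled first.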
11. Process: Old vs. New
New way = replication
• Create Linux replicas and edge servers connected to Windows commit
server.
• Use “verify -t” to transfer and convert archive files.
• Once everything but commit server is on Linux, failover commit server to
Linux RO replica.
14. Challenges – RHEL7
IT standardized on RHEL7.
RHEL7 uses systemd (vs. init scripts) to control services.
Worked with John Halbig in support to create service unit
files for p4d (and p4p and p4broker). He created a KB article
with a sample: http://answers.perforce.com/articles/KB/10832
15. Challenges – CVS2P4
Original Perforce system created with cvs2p4 script.
cvs2p4 works by using CVS change history to create
checkpoint to build DB files.
The db entries refer to original CVS (RCS) archive files
stored in special “import” dir.
CVS/RCS stores branches in the archive files, with revs like
1.1.1.1, but P4 uses separate files for branches.
16. Challenges – CVS2P4
P4D resolves the CVS 1.1.1.1 branch revs OK, but “verify -t”
wouldn’t transfer them.
Solution = Use smbclient + dos2unix to transfer the cvs2p4
import dir.
This brought over ~70K changes, then used “verify -t” to
get rest.
17. Challenges – Platform specific entries
Because we need to run in a mixed Win/Linux environment
during transition, can’t have platform specific entries
anywhere.
Specific examples:
• Depot dir location.
• Trigger table entries.
18. Challenges – Depot dir location
Win servers had “flat” P4ROOT layout where depot dirs were
nested in D:\p4d dir (e.g., D:\p4d\depot).
Short path length was partially to mitigate 260 char Win
max path issue.
Didn’t want depots in P4ROOT on Linux, so set
“server.depot.root” configurable to put dirs in other location.
19. Challenges – Trigger table entries
Single trigger table shared by all servers.
As a result, can’t have:
• OS-specific paths in table.
• OS-specific binaries in table.
• OS-specific references in triggers.
20. Challenges – Trigger table paths
Can’t use OS-specific paths in table.
Solution = %serverroot% var + make sure tools (Perl,
Python, etc.) are present in base system path.
Old =
C:\python27\python.exe C:\bin\trigger.py args
New =
python2 %serverroot%/triggers/trigger.py args
21. Challenges – Trigger table binaries
Can’t use OS-specific binaries in table.
Solution = Wrappers and/or rename via client.
Example: p4auth_ad.exe AD auth trigger.
Old =
C:\bin\p4auth_ad.exe args
New =
%serverroot%/triggers/p4auth_ad.exe args
Linux = p4auth_ad.pl gets synced as p4auth_ad.exe
22. Results
Speed = Syncs, checkpoint/DB rebuild process, p4todb
rebuild are all faster on Linux servers.
Load = Linux edge server for build farm handled 200+
“sync -f” at one time.
23. Summary
Time
• It will take longer than expected.
Effort
• It is easier than expected.
Work with support.
Tableau started in 2003, is another “Stanford spinoff” company. Started in Seattle, now has a number of offices. Has grown very quickly.
About me: Have used and administered Perforce for 10+ years at multiple companies. At Tableau I work on the “continuous delivery systems” team that manages Perforce, TeamCity, Artifactory, and similar systems.
This is one of two presentations from Tableau staff at the conference.
Why this presentation: Have attended many Perforce conferences, and always like to see presentations where people are doing unusual things with Perforce.
Also use other Perforce products: P4Web, p4todb, Swarm, GitFusion, GitSwarm (soon).
Many other systems connect to Perforce: TeamCity, ReviewBoard, OpenGrok, etc.
Other presentation has some examples of how data in Perforce is being used.
Main offices in WA (Seattle, Kirkland) and CA (Palo Alto). Also have offices in Austin, Vancouver (Canada), and UK/Germany (HyPer).
“best” solution = Perforce generally works best on Linux. Linux is the most widely used platform for Perforce, it’s what gets the most dev attention.
Helps that we have IT staff that understand Linux.
No monthly reboot for updates.
To last point, want to note that I’ve administered Perforce for many years on Windows, and some of the systems were reasonably large at 2500+ users, so I don’t really have any issues with running Perforce on Windows.
Tech writer said I needed to put some graphics in the presentation…
RO replicas to back up office edge servers are located in remote offices for reporting and DR.
Edge servers = 3x for offices, 2x for build farms.
Not shown: RO replicas to back up edge servers are located in offices (on same host as proxy).
At previous job (10+ years back) wrote Perl toolkit to do Win to Solaris migration. It can be done, but it’s a lot of work.
Have played with p4migrate, but haven’t used it to convert a production system. Think it only does file metadata, so you might still need to fix user names, client names, etc. Know it doesn’t like unloaded clients, there may be other limitations.
Our process:
Start with Linux edge server for build farm (stress test) and RO replica for p4todb/reporting.
Next set up edge servers to replace dev office fwd replicas.
Final step is to migrate commit server.
Tableau is not the only site using this approach.
Stages = If you have large enough metadata, the migration scripts can take hours or days to run, and you have to do EVERY server at the SAME TIME – which can be an issue when you have 5+ servers to convert all at once. Tableau (like other companies) is doing “continuous deployment” where we release updates on a regular cadence. Shutting down the SCM system for days is a difficult sell.
IT standard = need to use this on all “production” servers unless there’s a really good reason not to.
Systemd = A year+ ago, when I started working on this project, there weren’t many references to running P4D with systemd. Now there is the KB article and the SDP has an example.
Mention OOMKiller setting.
RHEL7 has full support for XFS now.
Another challenge for RHEL7: We also run GitFusion, which doesn’t (at last check) support https:// on RHEL7, but it can be set up manually.
Cvs2p4 is not the “p4convert” CVS import tool that Perforce has now.
There’s a KB article about the proxy not working right with CVS 1.1.1.1 format archives too.
smbclient has option to automatically lowercase all files it transfers.
Why not use smbclient to transfer all archives? Because we have a few files that have extended chars in the names, and smbclient won’t convert the chars but “verify -t” will. (Probably a Win vs. Linux codepage issue.)
CVS stores binary files inside ,v files, so even binaries had to have dos2unix run over them.
Iterated across each change via “p4 -s verify -qtz //...@1,@1”, etc., to make sure everything gets transferred.
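A small sketch of the per-change iteration described above. It only builds the argument lists; how they are executed (and against which replica or edge server) is left to the caller. The function name and the default depot path are assumptions for illustration.

```python
def verify_commands(first_change, last_change, depot_path="//..."):
    """Yield one 'p4 verify' argument list per changelist number."""
    for change in range(first_change, last_change + 1):
        # @N,@N limits the verify to revisions submitted in change N.
        revision_range = f"{depot_path}@{change},@{change}"
        yield ["p4", "-s", "verify", "-qtz", revision_range]


if __name__ == "__main__":
    for command in verify_commands(1, 3):
        print(" ".join(command))
```

Each list can be handed to subprocess.run() on the target server. Working one change at a time keeps each transfer small and makes it easy to resume after an interruption.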
When using verify -t to transfer files, don’t forget spec depot (no changes or lazy copies there) and unload depot (-U).
Depot dirs = Fortunately, we did not have hard-coded paths in the depot definition which would have made this harder.
260 char max path = now fixed with lfn configurable.
What about running from depot?
Won’t work for us because we have support files that need to be alongside trigger.
Also wouldn’t work during period when both Win and Linux servers are running side-by-side.
OS-specific references = Can’t have “C:\logs” (or similar) hard-coded in trigger scripts. Can’t (or at least shouldn’t) assume specific user context.
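One way to keep a trigger script free of OS-specific references is to derive paths from the script’s own location instead of hard-coding them. The sketch below is an assumption about how that could look, not the triggers actually used at Tableau; TRIGGER_LOG_DIR is a hypothetical environment variable, not a Perforce setting.

```python
import os


def trigger_log_path(script_file, log_name="trigger.log", env=None):
    """Pick a log file location without hard-coding an OS-specific path."""
    env = os.environ if env is None else env
    # TRIGGER_LOG_DIR is a hypothetical override, not a Perforce variable.
    log_dir = env.get("TRIGGER_LOG_DIR")
    if not log_dir:
        # Default to a "logs" directory next to the trigger script itself.
        script_dir = os.path.dirname(os.path.abspath(script_file))
        log_dir = os.path.join(script_dir, "logs")
    return os.path.join(log_dir, log_name)


if __name__ == "__main__":
    print(trigger_log_path(__file__))
```

Because os.path.join picks the separator for the host OS, the same script can sit under %serverroot%/triggers on both the Windows and the Linux servers during the transition.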
Perforce server on Win will work with Unix path separator (forward slash).
Linux doesn’t care about extensions, so as long as file has +x setting and #!/bin/env perl (or similar) line at start, you can call a Bash script with a .bat extension or a Perl script with a .exe extension.
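The point about extensions can be checked with a quick experiment. This sketch (Linux/Unix only) writes a tiny shell script under a .exe name, marks it executable, and runs it directly; the file name and its contents are made up for the demonstration.

```python
import os
import stat
import subprocess
import tempfile


def run_script_with_misleading_extension():
    """Write a shell script named like a Windows binary and execute it."""
    with tempfile.TemporaryDirectory() as workdir:
        # Hypothetical file name; the .exe suffix is ignored on Linux.
        script_path = os.path.join(workdir, "p4auth_demo.exe")
        with open(script_path, "w") as handle:
            handle.write("#!/bin/sh\necho ran as a shell script\n")
        # Set the executable bit; that plus the #! line is what matters.
        mode = os.stat(script_path).st_mode
        os.chmod(script_path, mode | stat.S_IXUSR)
        result = subprocess.run(
            [script_path], capture_output=True, text=True, check=True
        )
        return result.stdout.strip()


if __name__ == "__main__":
    print(run_script_with_misleading_extension())
```

This is the same mechanism that lets one trigger table entry ending in p4auth_ad.exe resolve to a real binary on Windows and to a Perl script on Linux.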
Similar problems with Swarm trigger (VBS vs. sh) although that’s been replaced by a single Perl script now.
Not going to put up a bunch of spreadsheet numbers – not really an apples to apples comparison, since Linux servers are better spec’d hardware (and are using XFS) and as we were doing this transition the codebase was getting smaller (moving third-party items to Artifactory) – but the number of build agents was growing.
Time = Other projects/issues/interruptions, time to migrate users from fwd replicas to edge servers, moving around systems during process, adding another edge server, etc.
Effort = Old way with checkpoint surgery is a LOT of work. (One of Ed’s original tasks when he was hired 6 years ago was to do conversion to Linux, and it was just too much time.)
Support = Lots of questions during this process – systemd, issues with integrity logs, etc.
Side-effect of change: All Linux hosts are identical configuration in terms of hardware, drive layout, OS, scripts, etc., which makes it easier for a team to support. Win hosts were built up over several years, so they’re not in sync.