Thursday, January 12, 2012

Svn2Svn – Copy and Sync Between SVN Repositories

The Mission

Break a huge, poorly organized Subversion repository into a number of small, project-oriented repositories.

The Challenge

  1. The repository is so large that generating a dump takes forever.
  2. There are build systems connecting to the current repository.
  3. During the migration process, we should not stop developers from committing their code.

Traditional Tools

The typical solution is to use the traditional Subversion dump and filter tools. That doesn’t work for us due to the size of the repository: the dump takes a long time, and filtering it into many small repositories takes weeks. In the meantime, developers need to continue to commit and the build systems must continue to work. Hence we need a solution that enables us to migrate gradually.

Replay Tool

The solution to the problem is a Subversion replay tool. The idea is to pull the revision history of a subtree in the source repository and replay it into the destination repository. New commits to the source repository can then be copied over incrementally. It should work like svnsync, but for a subtree instead of the entire repository, and without having to start from revision zero. The benefits of such a tool are:

  1. Projects can be moved one by one or a few at a time.
  2. Each project can move according to its own agenda.
  3. The build system can switch to the new repository first while developers are still committing to the old repository.

Why Another Svn2Svn

There was already such a tool available at http://svn2svn.codeplex.com. That tool has its own advantages but doesn’t work for us because it

  1. Does not preserve the author and date/time of the revision.
  2. Does not copy node properties, e.g. svn:ignore, svn:externals.
  3. Does not handle copied or moved nodes. This is particularly an issue for tags.
  4. Fails at a few other edge cases.

So I set out to completely redesign and rewrite another Svn2Svn.

The New Svn2Svn

The new Svn2Svn is now born. It runs on Windows and is based on SharpSvn. It fulfills the requirements above and has the features below.

  • Supports both command line and Windows UI.
  • Copies change sets from one SVN repository to another.
  • Supports non-rooted path (subtree) for both source and/or destination.
  • Doesn't require source and destination to have same path.
  • Copies node properties so that properties like svn:ignore and svn:externals are preserved in the destination repository.
  • Copies revision properties so that author and date/time of revisions are preserved in destination repository. [1]
  • Doesn't require starting from revision zero.
  • Can optionally specify a source revision range.
  • Supports move and copy in addition to add/delete/modify. A tag in source is copied as a true tag in destination with accurate copy from path and revision.
  • Can be stopped gracefully in the middle of the process.
  • Automatically resumes from a previously stopped session, whether it was stopped manually or by an error.
  • Able to auto-resync and incrementally copy over new change sets from source. [1]
  • Properly handles a rename that only changes letter case, e.g. renaming “Abc” to “abc”.
  • Intuitive to use. When it fails, you get a detailed error message.
  • Most of the time, you can recover from a failure by just deleting your working directory and restarting with the same parameters.

Note: features marked [1] require the destination repository to support revision property editing.

Download and Installation

Svn2Svn is a 32-bit .NET application packed in a zip file that contains both the Windows UI and the command line executable. The distribution is bundled with the SharpSvn DLLs, so you can simply unzip it to any folder you like and run either of the two executable files. They run on 64-bit systems as 32-bit applications.

To obtain Svn2Svn, download the latest version from http://code.google.com/p/kennethxublogsource/downloads/list.

Windows UI

[Screenshot: Svn2Svn Windows UI]

Command Line

C:\Svn2Svn>Svn2SvnConsole.exe
Usage:
        Svn2SvnConsole.exe [options] sourceUri destinationUri workingDir
options:
        -R:from:to Copy revisions specified by from and to.
        -R:start   Copy revisions from specified start to HEAD.
        -R::end    Copy revisions from 0 to specified end.
        -X         Do not copy any revision property.
        -X:[ADR]   Do not copy one or more revision properties. e.g.:
                   -X:D  - Do not copy date/time revision property.
                   -X:AD - Do not copy author and date/time revision property.
        -I         Ignore all none fatal errors. Only log them.
        -V         Log every revision.
        -V+        Log every revision and node.

Using Svn2Svn

The Svn2Svn UI and command line provide the same set of features, so we’ll use the command line as the example to show the usage.

Destination Repository Configuration

Some important features of Svn2Svn require the destination repository to allow revision property editing. Very often you’ll get the error message below:

Error Processing Revision
Repository has not been enabled to accept revision propchanges;
ask the administrator to create a pre-revprop-change hook

This is because the destination repository wasn’t configured to allow revision property editing. To enable it, you need to provide a pre-revprop-change hook, and make sure the hook allows the user who is running this tool to edit any property. The Svn2Svn zip package also includes a file named “pre-revprop-change.exe” in the “Misc” folder. It is very useful if you use a local repository as your destination. The pre-revprop-change.exe allows free editing of revision properties by anybody, and it prevents a DOS window from popping up on every commit, which happens when you use a .bat or .cmd based hook.
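If the destination lives on a server where you write the hook yourself, the hook can be more selective than the bundled executable. Subversion calls pre-revprop-change with the repository path, revision, user, property name and action as arguments, and a non-zero exit code rejects the change. The following is only a sketch of such a hook, assuming a Unix-like server where hook scripts can be plain executables; the account name "svn2svn-migrator" is hypothetical and is not something the tool defines.

```python
#!/usr/bin/env python3
"""Sketch of a pre-revprop-change hook restricted to a single account."""
import sys

# Hypothetical account that runs Svn2Svn; replace it with your own.
MIGRATION_USER = "svn2svn-migrator"


def decide(user, propname, action):
    """Return 0 to allow the revision property change, 1 to reject it."""
    # propname and action (A = add, M = modify, D = delete) are accepted so a
    # stricter policy could inspect them; this sketch only checks the user.
    return 0 if user == MIGRATION_USER else 1


def main(argv):
    # Subversion invokes the hook as: REPOS-PATH REVISION USER PROPNAME ACTION
    if len(argv) != 6:
        return 1
    _repos, _revision, user, propname, action = argv[1:6]
    return decide(user, propname, action)

# In a deployed hook the file would end with: sys.exit(main(sys.argv))
```

Allowing every property for one trusted account matches the requirement that the user running the tool can edit any property, while still rejecting edits from everybody else.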

Typical Usage

The tool is quite intuitive to use. In addition to the source and destination URIs, which are self-explanatory, it needs a working directory in which to manipulate the local changes. Below is a typical example:

Svn2SvnConsole.exe http://sourcerepo/svn/project1 http://destinationrepo/svn/ c:\temp\project1

This command can be scheduled to run many times. Every time it runs, it brings new commits from source to destination. Be careful not to make conflicting changes in the destination repository if you intend to run the same command again to bring over new commits from source.
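For scheduled runs, a small wrapper can assemble the arguments and report the exit code. This is a minimal sketch, assuming Svn2SvnConsole.exe is on the PATH; the function names build_command and run_sync are made up for illustration.

```python
import subprocess


def build_command(source_uri, destination_uri, working_dir, options=()):
    """Assemble the argument list in the order shown by the tool's usage text."""
    return ["Svn2SvnConsole.exe", *options, source_uri, destination_uri, working_dir]


def run_sync(source_uri, destination_uri, working_dir, options=()):
    """Run one incremental copy and return the tool's exit code."""
    command = build_command(source_uri, destination_uri, working_dir, options)
    return subprocess.call(command)
```

A scheduled task could call run_sync with the same three arguments every time and treat a non-zero return value as a run that stopped on a failure.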

Specify Revision Range

If you need to specify the range of revisions to copy over to the destination, use the -R option. You can specify both the starting and ending revisions or just one of them. If only one is specified, the other takes a default value: zero for the starting revision and HEAD for the ending revision. The command below copies from revision zero to revision 678.

Svn2SvnConsole.exe http://sourcerepo/svn/project1 http://destinationrepo/svn/ c:\temp\project1 -R::678
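The three forms of the range option and their defaults can be illustrated with a short parser. This is not the tool's own parsing code, only a sketch of the rules described above; parse_revision_option is a hypothetical name, None stands for HEAD, and accepting either letter case is an assumption.

```python
def parse_revision_option(option):
    """Turn -R:from:to, -R:start or -R::end into a (start, end) pair.

    end is None when the range runs to HEAD.
    """
    if not option.upper().startswith("-R:"):
        raise ValueError("not a revision range option: " + option)
    spec = option[3:]
    # Split once on the first colon; a missing side falls back to its default.
    start_text, _, end_text = spec.partition(":")
    start = int(start_text) if start_text else 0
    end = int(end_text) if end_text else None
    return start, end
```

For example, "-R::678" yields (0, 678) and "-R:500" yields (500, None).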

Control Copy of Revision Property

There are times when you have no control over the destination repository, so you cannot change revision properties. In this case, you’ll need to disable copying of revision properties by using the -X option.

You can also selectively disable copying of specific revision properties by using the option -X: immediately followed by any combination of the letters A, D and R. For example, the option -X:AD disables copying of author and date/time.

  • A – Author, svn:author
  • D – Date/time, svn:date
  • R – Source revision, svn2svn:revision
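The letter-to-property mapping above can be written down as a small lookup. The sketch below only models which revision properties remain enabled for a given -X option; properties_to_copy is a hypothetical helper, not part of the tool.

```python
REVISION_PROPERTIES = {
    "A": "svn:author",
    "D": "svn:date",
    "R": "svn2svn:revision",
}


def properties_to_copy(x_option=None):
    """Return the set of revision properties still copied for a given -X option."""
    if x_option is None:
        return set(REVISION_PROPERTIES.values())
    if x_option == "-X":
        return set()  # plain -X disables every revision property
    if not x_option.startswith("-X:"):
        raise ValueError("not an -X option: " + x_option)
    letters = x_option[3:].upper()
    unknown = set(letters) - set(REVISION_PROPERTIES)
    if unknown:
        raise ValueError("unknown letters: " + "".join(sorted(unknown)))
    return {prop for letter, prop in REVISION_PROPERTIES.items() if letter not in letters}
```

Note that svn2svn:revision is what the tool reads back to find previously copied revisions, so disabling R also gives up incremental copying.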

Logging Control

Use the -V option to have the tool output a message for every revision copied. Use the -V+ option to additionally output detailed processing information for each node.

Error Handling

Whenever a non-fatal error occurs, the tool will display a title followed by the error message and give you options to fail, retry, ignore or ignore all.

  • Fail will stop the process and print out the full stack trace. The command line tool will exit with a non-zero error code.
  • Retry will attempt the same operation again.
  • Ignore will leave the problem as is and move on to the next operation.
  • Ignore all will do the same as “ignore” and additionally have the tool automatically ignore all further errors with the same title.

The command line tool also supports the -I option to automatically ignore all non-fatal errors. Non-fatal errors are those errors related only to a specific revision or node.
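The four choices behave like a small retry policy keyed by error title. The sketch below models that behaviour under simplifying assumptions: every exception is treated as non-fatal, and the names run_with_policy, CopyFailed and the ask callback are hypothetical rather than taken from the tool.

```python
class CopyFailed(Exception):
    """Raised when the user chooses to fail the whole process."""


def run_with_policy(operation, title, ask, ignored_titles):
    """Run operation; on error, consult ask(title, error) for what to do.

    ask returns "fail", "retry", "ignore" or "ignore all".
    Returns True if the operation succeeded, False if its error was ignored.
    """
    while True:
        try:
            operation()
            return True
        except Exception as error:
            if title in ignored_titles:
                return False  # an earlier "ignore all" covers this title
            choice = ask(title, error)
            if choice == "retry":
                continue
            if choice == "ignore all":
                ignored_titles.add(title)
                return False
            if choice == "ignore":
                return False
            raise CopyFailed(f"{title}: {error}") from error
```

In this model, the -I option would correspond to an ask callback that always answers "ignore all".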

Interrupting the Process

The Windows UI turns the “Copy” button into a “Stop” button while the copy is in progress. You can click the Stop button to stop the process. The command line utility supports Ctrl-C to stop the process. As soon as you press Ctrl-C during the copy process, the tool asks whether you want to stop gracefully or fail immediately. You should almost always stop gracefully by pressing the ‘Y’ key. The fail-immediately option is provided only for cases where the graceful stop doesn’t work. If that happens, press the ‘X’ key to exit immediately. If you pressed Ctrl-C by mistake, press any key other than ‘X’ and ‘Y’ to resume the copy process.

How Svn2Svn Works

Every time you run Svn2Svn, it goes through the steps below:

  1. Determine the highest revision of the source if an end revision wasn’t specified.
  2. Create the destination directory if one doesn’t already exist in the repository.
  3. If the working directory doesn’t exist, check out the destination to the working directory. Otherwise, do svn cleanup, revert any pending changes, delete all non-versioned files and directories, and finally do svn update.
  4. Scan through the revision history of the destination to detect previous copy information by retrieving the revision property named “svn2svn:revision”, and establish the source revision to destination revision mapping for anything that was previously copied.
  5. Loop through source revisions a hundred at a time to avoid HTTP timeouts. Skip revisions that were already copied, using the revision map collected in the previous step.
  6. If this is the first time copying, i.e. no revision mapping was discovered in step 4, the tool exports the nodes of the first source revision, including their properties, to the working directory and commits the changes to the destination.
  7. For each subsequent revision, loop through all nodes in the change set and perform the operations below:
    1. Delete, modify or add/copy the node.
    2. Delete existing properties from the node and copy the node properties from source.
  8. If a rename that only changes letter case is detected, the add/copy operations are all deferred. The tool executes an additional commit followed by the deferred add/copy operations.
  9. When all nodes are done, commit to the destination repository with the same message as the source.
  10. Edit the revision properties according to the options specified. You typically want the author and date/time copied over.
  11. Add the “svn2svn:revision” property to the destination revision, with the source revision number as its value. This information is essential for incrementally copying change sets.
  12. Go back to step 6 and repeat the process until the end revision is reached.
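Steps 4 and 5 are what make resuming possible, and they can be sketched without any SVN calls. The code below is a simplified model, not the tool's implementation: revision properties are plain dictionaries, every number in the range is treated as a candidate source revision, and the function names are hypothetical.

```python
def build_revision_map(destination_revprops):
    """Map source revision -> destination revision from svn2svn:revision values.

    destination_revprops maps each destination revision number to a dict of
    its revision properties.
    """
    mapping = {}
    for dest_rev, props in destination_revprops.items():
        source_rev = props.get("svn2svn:revision")
        if source_rev is not None:
            mapping[int(source_rev)] = dest_rev
    return mapping


def pending_batches(start, end, revision_map, batch_size=100):
    """Yield lists of not-yet-copied source revisions, scanning batch_size at a time."""
    for batch_start in range(start, end + 1, batch_size):
        batch_end = min(batch_start + batch_size - 1, end)
        pending = [rev for rev in range(batch_start, batch_end + 1)
                   if rev not in revision_map]
        if pending:
            yield pending
```

Because the mapping is rebuilt from the destination on every run, the working directory holds no state that cannot be recreated, which is why deleting it is a safe recovery step.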

Misc. Tools

The Svn2Svn package also includes additional material in the Misc subdirectory. One item is the “pre-revprop-change.exe” file that enables revision property editing, discussed above. Another is a zip of an empty Subversion 1.6 repository, so you can easily create a clean 1.6 repository. Simply unzip it to any directory you choose.

Known Issues and Workarounds

Authentication

At this moment, the tool doesn’t have a built-in authentication mechanism. The workaround is to use any other SVN client to authenticate with the source and/or target and make sure you save the password. I have used TortoiseSVN and the standard SVN command line client with success. If somebody has a patch to enable authentication, I’m more than happy to put it in, but I don’t have a need for this myself.

Error in Working Directory

From time to time, you may encounter errors. Some of them are due to a bug in this tool, a bug in SVN itself, or the difference in case sensitivity between Windows and the SVN repository. But so far most problems can be solved by simply deleting the working directory (delete the directory itself, not just its content) and rerunning the same command.
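To check whether case sensitivity is likely to cause trouble, you can look for repository paths that differ only in letter case, since they cannot coexist in a Windows working directory. The sketch below assumes you already have a list of repository paths (for example from a recursive listing) and uses lower-casing as an approximation of Windows path comparison; case_collisions is a hypothetical helper.

```python
from collections import defaultdict


def case_collisions(paths):
    """Return groups of paths that are identical when letter case is ignored."""
    groups = defaultdict(set)
    for path in paths:
        groups[path.lower()].add(path)
    # Only groups with more than one distinct spelling are collisions.
    return sorted(sorted(group) for group in groups.values() if len(group) > 1)
```

An empty result means no two paths in the list clash on a case-insensitive file system.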

36 comments:

Tubbys said...

This tool is good but doesnt work if we have different authentications on source and destination servers

Kenneth Xu said...

@Tubbys, the tool does work with different authentication on source and destination. But you must use any SVN client to pre-authenticate with both source and destination and save your password.

I'll see if I can enhance the tool to prompt for username and password if one isn't saved.

Thanks for your comment!

Tomasz Sawicki said...

Great tool. Work like a charm. Thanks for sharing!

Anonymous said...

I am getting the below error -

19.03.2012 17:58:56
SharpSvn.SvnWorkingCopyException: The path 'C:\username\FW.Core' appears to be part of a Subversion 1.7 or greater
working copy. Please upgrade your Subversion client to use this
working copy.
at SharpSvn.SvnClientArgs.HandleResult(SvnClientContext client, SvnException error, Object targets)
at SharpSvn.SvnClientArgs.HandleResult(SvnClientContext client, svn_error_t* error, Object targets)
at SharpSvn.SvnClient.CleanUp(String path, SvnCleanUpArgs args)
at SharpSvn.SvnClient.CleanUp(String path)
at Svn2Svn.Copier.PrepareWorkingDir() in L:\OpenSource\Svn2Svn\Svn2Svn\Copier.cs:line 147
at Svn2Svn.Copier.Copy(Int64 startRevision, Int64 endRevision) in L:\OpenSource\Svn2Svn\Svn2Svn\Copier.cs:line 87
at Svn2Svn.CopyForm.DoCopy() in L:\OpenSource\Svn2Svn\Svn2Svn\CopyForm.cs:line 85
19.03.2012 17:59:00

Kenneth Xu said...

@Anonymous, you obviously touched the working dir with your 1.7 client. Delete the working dir and try again.

Sebastian said...

I'm getting the same error:

..."appears to be part of a Subversion 1.7 or greater working copy. Please upgrade your Subversion client to use this working copy."...

Could you please build a new version with the 1.7 version of the sharpsvn dlls included? I'd really like to use your tool.

Thx in advance.

Kenneth Xu said...

@Sebastian, you can do one of two things:

1) delete your working dir and run the app again. It will resume and run fine as long as you don't touch the working copy again with 1.7 client. You can also point it to a new empty folder as working dir.

2) Download the 1.7 version of SharpSvn and use assembly redirect.

Mike said...

Hi Kenneth,

first thanks for this tool, sounds really promising. However, still facing issues to see whether it can hold up to that promise ;-)

At the moment, I'm having same issues as Anonymous above (regarding 1.7 client). It's possibly due to the new working copy format introduced in 1.7.
However, I wasn't able to rollback the format - and simply deleting the working dir as you suggested gives the error, that the specified folder is not a working dir folder :-/
Or did I misunderstand sthg in your suggestion here?

Would it work if I were to download/use an older client again (e.g. TortoiseSVN 1.6?)

Looking fwd to your comment!

Kenneth Xu said...

@Mike, when I said delete the working dir, I meant to delete the folder, not just the content inside it. For example, if you specify working dir as C:\Temp\myworkingdir, then C:\Temp must exist, but myworkingdir must NOT exist. This was explained briefly in step 3 of the "How Svn2Svn works" section in the original post. But I'll try to make it clearer. Thanks for the feedback and let me know if that fixes your problem.

Mike said...

@Kenneth, thanks for the clarification. It ran through now. (When the folder didn't exist, it took very long though (which isn't the bad thing), but I would have wished there would have been log feedback (log was just empty for a couple of hours, even on "Detail"))

I also tried your second suggestion (assembly redirect), which also seems to work (and which I prefer as solution)!

Now I just need to see whether the tool fits my use case ;-)

Keep up the good work!

Kenneth Xu said...

@Mike, glad to know that and it seems that you have a huge destination repository. I guess the app doesn't log anything when it is checking out to working dir. I agree this is something that can be improved. Thanks for your feedback. And feel free to open an enhancement request on the Google code where you downloaded this binary.

Mike said...

@Kenneth, thanks for the comment.
My repository indeed is quite huge. Unfortunately, on one of my machines SVN2SVN simply hangs after 20 mins or so (i.e., the UI becomes unresponsive, but CPU continues to consume ~25 %). However, doesn't seem that just the UI is unresponsive, as it continues to run for days and never finishes (I hoped that it might still do the actual work in the background, but it should have been finished by now).

Have you by any chance experienced such unresponsive UI behavior?

Cheers!

Kenneth Xu said...

@Mike, the app uses svn client to commit the changes to your destination repo. When you have huge repo and working dir, the commit is going to take long time. When I say huge, I mean you keep a lot of projects with a lot of branches and tags. I created this utility to break a huge repo into multiple small ones so I didn't have that problem. I don't know your use case, but in general, having a huge SVN repo is alarming.

Cheers!

Mike said...

@Kenneth: I see, it indeed takes a long time, but it ran through for my first (though smaller) folder *yeah*

Unfortunately, the second (and bigger) folder fails after some time of execution with following error:
"Commit failed (details follow): --> Can't open file 'H:\SVN_local_path\db\txn-protorevs\848-sp.rev': The file exists."

(In fact the file exists, but I have no clue what the specific problem is here. Also, disk space is still enough (in case it's a writing problem)).

Did you by any chance encounter that message before?

Kenneth Xu said...

@Mike, not too sure, but I would try delete the txn-protorevs folder in working dir and let it resume.

Modjo said...

I'm trying to use svn2svn, it works fine for everything except that it skips revision with modifications only. Any idea about what is going on ?

Kenneth Xu said...

@Modjo, I didn't experienced this problem and it process all revisions fine. Can you clarify what exactly do you mean by modifications only?

Modjo said...

Actually, it commits the revision only when there are created/deleted files. Whenever there are updates to some files it does not process it. when there are no created/deleted files it just skips the revision.

Kenneth Xu said...

@Modjo, modification only is such a common use case I bet everybody who uses this tool must have encountered it. I have used this to move a number of projects out of a huge repository and had never experienced such a problem. Can you show some log? I'm wondering if it is not seeing the modification or not committing.

andy.gikling said...

Kenneth,

This seems like an amazing tool if it works as described! However, how do I pass in a user name and password exactly? My destination and source repos have the same username in this case.

Above you said it was possible via authenticating with some SVN client? I use SmartSVN. How is this done? Do you have the username fixes done in the program yet?

Thanks again

Kenneth Xu said...

@andy.gikling, I used TortoiseSVN to authenticate with my source and target by checking save password box. I belive other SVN client should have similar feature. And for most of my use, I copy to an empty target repo (provided in the zip file) and then I hand over the repo to the admin to import into target repo.

DFredericks said...

I like the idea of your tool, but i got this error:
sharpSVN.SVNAuthorizationException: OPTIONS of 'svn https path' authorization failed: Could not authenticate to server: rejected Basic challenge (https://server path) at.......

Do you have any idea why this would happen, and what to do about it? I created a new repo on the server, and then just added using tortoiseSVN that code, and in doing that, the cert was verified, then i added username and password.
Is there something I missed?

thanks
Dan

Søren H. said...

Works fine, thank you, except when a filename contains a #. That produces a URL doesn't exist error.

woecki said...

great stuff!!
i only found that it doesn't work if the source directory was renamed. if e.g. a path was http://svn/repo/foo but renamed to http://svn/repo/bar, svn2svn is not able to copy it ("URL 'http://svn/repo/foo' non-existent in that revision" at the revision# where "foo" was created). probably because svn2svn is using the "export" functionality that doesnt work either in such situations. unfortunately our repos are filled with renames. any ideas how to workaround?

kevinl said...

works fine except i get log messages saying that certain files already exist. Is there a way to allow overwrites in the working directory so that my end result is the same as my old repo?

kevinl said...

after further use of this tool, it looks like it doesn't import all revisions. I don't know what's going on, but when comparing the two repos side by side, the revision details are not 100% on the new repo.

Kenneth Xu said...

@woecki, are you coping entire repo or just just repo/bar? can you give sample command line?

@kevinl, it seems that your working folder is corrupted, read "Error in Working Directory" section

kevinl said...

Kenneth, I tried re-running the same command after deleting the entire working directory, but it still does not work. For example, I had a revision that changed folder1/folder2/maps.file to folder1/gps_map/maps.file, and the program does not like that.

kevinl said...

I am looking at a revision number on my old repo, but when I try it on svn2svn, I get the following error: No matching destination revision for source revision number 5833

kevinl said...

okay i found out the error with this tool. it is not picking up any modifications within the revisions... only additions and deletes.

For example, my revision on the old repo has 50 changes (30 additions, 20 modifications). Using the import tool, the revision on the new repo has only 30 changes (which are the additions, but did not detect the modifications).

kevin said...

the way to bypass this issue was to do the following:

1) in the old repository, commit any random change (possibly just 1 file)
2) with the tool, update the "From" revision field to the latest revision on the old repo
3) the new repo will have the random committed file as well as pretty much all the modified files that did not make it into the new repo.

woecki said...

i try to copy just repo/bar (which was repo/foo before a svn mv operation).
i have to try if running svn2svn on repo/foo with "-r from:to " + separate run on repo/bar with "-r from" can act as workaround. will take some time, but i'll let you know.

Anonymous said...

Hi

Thanks for this tool.
Looking at the log printout it seems to work just fine.

But I get the same regarding missing revisions:
The create revision of a file is matching but all modification revisions are missing.

Anonymous said...

And I only get the FIRST checked in version of the file

Anonymous said...

Hi,
Thanks for this tool - was very useful. I have made a number of fixes and enhancements that mean that it rarely errors now (just migrated a set of projects across our 20 man team with no errors).
Are you interested in receiving patches?

Kenneth Xu said...

sure, can you create issues in the googlecode and attach your patch there?
