Optionally link all apparent changeset references to BitBucket #102

Merged: 2 commits, Jun 25, 2018
2 changes: 2 additions & 0 deletions README.md
@@ -63,6 +63,8 @@ It's probably easiest to install the dependencies using Python 3's built-in
 you are running this script, and the GitHub comments
 will be already under your name.

+--link-changesets Link changeset references back to BitBucket.
+
 $ python3 migrate.py <bitbucket_repo> <github_repo> <github_username>

 ## Example:
32 changes: 25 additions & 7 deletions migrate.py
@@ -104,6 +104,11 @@ def read_arguments():
         ),
     )

+    parser.add_argument(
+        "--link-changesets", action="store_true",
+        help="Link changeset references back to BitBucket.",
+    )
+
     return parser.parse_args()


@@ -396,7 +401,8 @@ def convert_comment(comment, options):
     """

 def format_issue_body(issue, options):
-    content = convert_changesets(issue['content'])
+    content = issue['content']
+    content = convert_changesets(content, options)
     content = convert_creole_braces(content)
     content = convert_links(content, options)
     content = convert_users(content, options)
@@ -418,7 +424,8 @@ def format_issue_body(issue, options):
     return header + content

 def format_comment_body(comment, options):
-    content = convert_changesets(comment['content'])
+    content = comment['content']
+    content = convert_changesets(content, options)
     content = convert_creole_braces(content)
     content = convert_links(content, options)
     content = convert_users(content, options)
@@ -498,7 +505,7 @@ def convert_date(bb_date):
     raise RuntimeError("Could not parse date: {}".format(bb_date))


-def convert_changesets(content):
+def convert_changesets(content, options):
     """
     Remove changeset references like:

@@ -507,10 +514,21 @@ def convert_changesets(content):
     Since they point to mercurial changesets and there's no easy way to map them
     to git hashes, better to remove them altogether.
     """
-    lines = content.splitlines()
-    filtered_lines = [l for l in lines if not l.startswith("→ <<cset")]
-    return "\n".join(filtered_lines)
-
+    if options.link_changesets:
+        # Look for things that look like sha's. If they are short, they must
+        # have a digit
+        def replace_changeset(match):
+            sha = match.group(1)
+            if len(sha) >= 8 or re.search(r"[0-9]", sha):
+                return ' [{sha} (bb)](https://bitbucket.org/{repo}/commits/{sha})'.format(
+                    repo=options.bitbucket_repo, sha=sha,
+                )
+        content = re.sub(r" ([a-f0-9]{6,40})\b", replace_changeset, content)
Owner


It's been a while since I worked with regexes... What does the {6,40} here do? Does it replace characters 6 through 40, inspect characters 6 through 40, or only match strings of length 6 to 40?

Contributor Author

@nedbat, Jun 2, 2018


[a-f0-9]{6,40} means 6 to 40 hex characters. This is looking for a hex string between 6 and 40 characters long, preceded by a space, and ending at a word boundary. Without the space, there were some false positives, but this seemed to work well.
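
To make the quantifier concrete, here is a small sketch that runs the same pattern against a few made-up strings. The sample strings and the `CHANGESET_RE` name are illustrative assumptions, not part of the PR.

```python
import re

# Same pattern as in the diff: a space, then 6 to 40 lowercase hex characters
# (captured as group 1), then a word boundary.
CHANGESET_RE = re.compile(r" ([a-f0-9]{6,40})\b")

# Hypothetical inputs, chosen to exercise each part of the pattern.
samples = [
    "fixed in 22f3981d50c8 yesterday",  # 12 hex chars after a space
    "fixed in abc12 yesterday",         # only 5 hex chars, below the minimum
    "see issue#22f3981d50c8",           # hex run is not preceded by a space
    "fixed in 22f3981d50c8xyz",         # hex run runs into other word characters
]

for text in samples:
    print(repr(text), "->", CHANGESET_RE.findall(text))
```

Only the first sample yields a match. `{6,40}` bounds the length of the run and says nothing about positions in the string; the leading space and the trailing `\b` are what reject the third and fourth samples.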

Contributor Author


I tried to improve the heuristic by insisting that if the sha was shorter than 8 characters, it had to have a digit in it, so that English words wouldn't get linked.
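
The heuristic can be sketched in isolation as below. The names `looks_like_sha` and `link_changesets` and the repository `example/repo` are hypothetical. The explicit fallback `return` is an addition of this sketch: as far as I can tell, CPython's `re.sub` treats a callback that returns `None` as an empty replacement, which would delete a rejected candidate, so the sketch returns the original text instead.

```python
import re

CHANGESET_RE = re.compile(r" ([a-f0-9]{6,40})\b")


def looks_like_sha(candidate):
    """Heuristic from the thread: candidates under 8 chars must contain a digit."""
    return len(candidate) >= 8 or re.search(r"[0-9]", candidate) is not None


def link_changesets(content, bitbucket_repo):
    """Turn sha-looking tokens into Markdown links to BitBucket commits."""
    def replace_changeset(match):
        sha = match.group(1)
        if looks_like_sha(sha):
            return " [{sha} (bb)](https://bitbucket.org/{repo}/commits/{sha})".format(
                repo=bitbucket_repo, sha=sha,
            )
        # Assumption of this sketch: leave rejected candidates untouched.
        return match.group(0)

    return CHANGESET_RE.sub(replace_changeset, content)


# "22f3981" has digits, so it is linked; "decade" is all hex letters but
# short and digit-free, so it is left alone.
print(link_changesets("Fixed in 22f3981 last decade", "example/repo"))
```

An English word made only of the letters a to f and at least 8 characters long would still be linked, so this remains a heuristic rather than a guarantee.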

+    else:
+        lines = content.splitlines()
+        filtered_lines = [l for l in lines if not l.startswith("→ <<cset")]
+        content = "\n".join(filtered_lines)
+    return content

 def convert_creole_braces(content):
     """
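
For reference, a minimal sketch of how the new flag reaches the code and what the default branch does when it is absent. It assumes a stripped-down parser with only the new option (the real script also takes positional arguments), and `strip_changeset_lines` is a hypothetical name for the `else` branch above.

```python
import argparse


def read_arguments(argv):
    # Assumption: only the new flag is declared here; the positional
    # repository arguments of the real script are left out.
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--link-changesets", action="store_true",
        help="Link changeset references back to BitBucket.",
    )
    return parser.parse_args(argv)


def strip_changeset_lines(content):
    """Default behavior: drop lines that start with a changeset marker."""
    lines = content.splitlines()
    kept = [line for line in lines if not line.startswith("→ <<cset")]
    return "\n".join(kept)


# argparse maps "--link-changesets" to the attribute "link_changesets".
print(read_arguments([]).link_changesets)
print(read_arguments(["--link-changesets"]).link_changesets)

# Without the flag, changeset lines are removed outright.
print(strip_changeset_lines("Fixed.\n→ <<cset 22f3981d50c8>>\nThanks!"))
```

Because the flag uses `store_true`, existing invocations keep the old stripping behavior unless `--link-changesets` is passed.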